UNIFORMLY ROOT-N CONSISTENT DENSITY ESTIMATORS
FOR WEAKLY DEPENDENT INVERTIBLE LINEAR PROCESSES

By Anton Schick and Wolfgang Wefelmeyer

Binghamton University and University of Cologne 

Convergence rates of kernel density estimators for stationary time 
series are well studied. For invertible linear processes, we construct 
a new density estimator that converges, in the supremum norm, at 
the better, parametric, rate $n^{-1/2}$. Our estimator is a convolution of
two different residual-based kernel estimators. We obtain in particu- 
lar convergence rates for such residual-based kernel estimators; these 
results are of independent interest. 

1. Introduction. The usual estimators for the density of a stationary pro- 
cess are kernel estimators and their recursive versions. Rates of convergence 
and pointwise central limit theorems have been studied under various mix- 
ing conditions by Robinson [24], Chanda [8], Castellana and Leadbetter [7], 
Masry [19, 20, 21, 22], Tran [39, 40, 41], Roussas [27, 28, 29], Cai and Rous- 
sas [6], Ango Nze and Portier [2], Ango Nze and Doukhan [1], Ango Nze and 
Rios [3], Doukhan and Louhichi [11] and Dedecker and Merlevède [10], and
for linear processes by Hall and Hart [14], Tran [42], Hallin and Tran [15], 
Coulon-Prieur and Doukhan [9], Honda [16], Lu [18], Wu and Mielniczuk 
[43], Bryk and Mielniczuk [5] and Schick and Wefelmeyer [36, 37]. Under 
appropriate conditions, the convergence rates of these kernel estimators are 
the same as for independent and identically distributed observations. 

Linear processes are written as linear combinations of independent in- 
novations and the stationary density can be represented as a convolution 
of other densities in many different ways. We use the simplest such repre- 
sentation and estimate the stationary density by plugging in residual-based 
estimators of the densities involved in the representation. We expect this 
to lead to faster, parametric rates of convergence. This is already known in
nonparametric models with i.i.d. observations. Frees [12] shows that his
plug-in estimators for densities of certain functions $q(X_1,\dots,X_m)$ are
pointwise $n^{1/2}$-consistent. Saavedra and Cao [32] consider the special case
$q(X_1,X_2) = X_1 + aX_2$. Schick and Wefelmeyer [34, 38] prove functional
convergence for $q(X_1,\dots,X_m) = u_1(X_1) + \cdots + u_m(X_m)$ and
$q(X_1,X_2) = X_1 + X_2$, viewing their estimators as elements of $L_1$ or of
the space $C_0(\mathbb{R})$ of continuous functions on $\mathbb{R}$ vanishing
at infinity. Giné and Mason [13] obtain functional results in $L_p$, locally
uniformly in the bandwidth, for general $q(X_1,\dots,X_m)$.
Special cases of the semiparametric time series model considered here have 
also been studied. Saavedra and Cao [31] consider pointwise convergence of 
plug-in estimators for the stationary density of moving average processes of 
order one. Schick and Wefelmeyer [33] obtain asymptotic normality and
efficiency and Schick and Wefelmeyer [35] generalize this result to
higher-order moving average processes and to functional convergence in $L_1$
and $C_0(\mathbb{R})$; see below for details. Here, we consider general
invertible linear processes and obtain $n^{1/2}$-consistency of our estimator
for the stationary density in $C_0(\mathbb{R})$.

Specifically, we consider a stationary linear process with infinite-order 
moving average representation 

\[ X_t = \varepsilon_t + \sum_{s=1}^{\infty} \varphi_s \varepsilon_{t-s}, \qquad t \in \mathbb{Z}, \tag{1.1} \]

with summable coefficients $\varphi_s$ and independent and identically
distributed (i.i.d.) innovations $\varepsilon_t$, $t \in \mathbb{Z}$, having
mean zero and finite variance. If the innovations have a density $f$, then
$X_0$ has a density, say $h$. The usual estimator of this density from
observations $X_1,\dots,X_n$ of the linear process is a kernel density
estimator

\[ \tilde h(x) = \frac{1}{n}\sum_{j=1}^{n} k_{b_n}(x - X_j), \qquad x \in \mathbb{R}, \]

where $k_{b_n}(x) = k(x/b_n)/b_n$ for some kernel $k$ (an integrable function
that integrates to 1) and some bandwidth $b_n$ (tending to 0).

Our goal is to construct an $n^{1/2}$-consistent estimator of $h$. For this,
we set

\[ Y_t = X_t - \varepsilon_t = \sum_{s=1}^{\infty} \varphi_s \varepsilon_{t-s}, \qquad t \in \mathbb{Z}. \]

We must exclude the degenerate case that the observations are i.i.d.:

(C) At least one of the moving average coefficients $\varphi_s$ is nonzero.

$Y_0$ then has a density, say $g$. We have $X_0 = \varepsilon_0 + Y_0$. Since
$Y_0$ is independent of $\varepsilon_0$, we can express the density $h$ of
$X_0$ as the convolution $h = f * g$ of $f$ and $g$. We obtain an estimator of
$h$ as $\hat h = \hat f * \hat g$, where $\hat f$ and $\hat g$ are estimators
of $f$ and $g$. We base these estimators on estimators of the innovations.
For this, we require invertibility of the process.

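(I) The function $\varphi(z) = 1 + \sum_{s=1}^{\infty} \varphi_s z^s$ is
bounded and bounded away from zero on the complex unit disk
$\{z \in \mathbb{C} : |z| \le 1\}$.

$\varrho(z) = 1/\varphi(z) = 1 - \sum_{s=1}^{\infty} \varrho_s z^s$ is then
also bounded and bounded away from zero on the complex unit disk. Hence, the
innovations have the infinite-order autoregressive representation

\[ \varepsilon_t = X_t - \sum_{s=1}^{\infty} \varrho_s X_{t-s}, \qquad t \in \mathbb{Z}. \tag{1.2} \]

Let $p_n$ be positive integers with $p_n/n \to 0$. For $j = p_n+1,\dots,n$, we
mimic the innovation $\varepsilon_j$ by the residual

\[ \hat\varepsilon_j = X_j - \sum_{i=1}^{p_n} \hat\varrho_i X_{j-i}, \]

where $\hat\varrho_i$ is an estimator of $\varrho_i$ for $i = 1,\dots,p_n$. We
then estimate the innovation density by a kernel estimator based on the
residuals,

\[ \hat f(x) = \frac{1}{n-p_n}\sum_{j=p_n+1}^{n} k_{b_n}(x - \hat\varepsilon_j), \qquad x \in \mathbb{R}, \]

and we estimate the density $g$ by a kernel estimator based on the differences
$\hat Y_j = X_j - \hat\varepsilon_j$,

\[ \hat g(x) = \frac{1}{n-p_n}\sum_{j=p_n+1}^{n} k_{b_n}(x - \hat Y_j), \qquad x \in \mathbb{R}. \]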
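The construction of the residuals and of the two residual-based kernel estimators can be sketched numerically. The following is only an illustration under stated assumptions: a Gaussian kernel, least squares estimates of the first $p$ autoregression coefficients, and hypothetical function names (`fit_ar_least_squares`, `residual_estimators`) that do not come from the paper.

```python
import numpy as np

def fit_ar_least_squares(x, p):
    """Least squares regression of X_j on (X_{j-1}, ..., X_{j-p}), j = p+1, ..., n."""
    n = len(x)
    # Column i-1 holds the lag-i values X_{j-i} for j = p+1, ..., n.
    lags = np.column_stack([x[p - i:n - i] for i in range(1, p + 1)])
    target = x[p:]
    rho_hat, *_ = np.linalg.lstsq(lags, target, rcond=None)
    return rho_hat, lags, target

def residual_estimators(x, p, b):
    """Return (f_hat, g_hat, eps_hat, y_hat) for cut-off p and bandwidth b."""
    rho_hat, lags, target = fit_ar_least_squares(x, p)
    y_hat = lags @ rho_hat        # Y_hat_j = X_j - eps_hat_j
    eps_hat = target - y_hat      # residuals mimicking the innovations

    def kde(points):
        # Gaussian-kernel density estimator based on the given points
        # (the Gaussian kernel is an assumption of this sketch).
        def density(grid):
            u = (np.asarray(grid, dtype=float)[:, None] - points[None, :]) / b
            return np.exp(-0.5 * u**2).sum(axis=1) / (len(points) * b * np.sqrt(2.0 * np.pi))
        return density

    return kde(eps_hat), kde(y_hat), eps_hat, y_hat
```

By construction the residuals and the differences add up to the observations, $\hat\varepsilon_j + \hat Y_j = X_j$, which is a convenient sanity check for any implementation.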

In addition to (C) and (I), we use the following assumptions: 

(Q) the autoregression coefficients satisfy
$\sum_{s>p_n} |\varrho_s| = O(n^{-1/2-\zeta})$ for some $\zeta > 0$;

(R) the estimators $\hat\varrho_i$ of the autoregression coefficients
$\varrho_i$ satisfy

\[ \sum_{i=1}^{p_n} (\hat\varrho_i - \varrho_i)^2 = O_p(q_n n^{-1}) \]

for some $q_n$ with $1 \le q_n \le p_n$;

(S) the moving average coefficients satisfy $\sum_{s=1}^{\infty} s|\varphi_s| < \infty$;

(F) the density $f$ has mean zero, a finite fourth moment, is absolutely
continuous with a bounded and integrable (almost everywhere) derivative
$f'$, and the function $x \mapsto x f'(x)$ is bounded and integrable.
The usual estimators of the autoregression coefficients are the least squares
estimators $\hat\varrho_1,\dots,\hat\varrho_{p_n}$ which minimize
$\sum_{j=p_n+1}^{n} \bigl(X_j - \sum_{i=1}^{p_n} \varrho_i X_{j-i}\bigr)^2$.
By Lemma 1, they meet condition (R) with $q_n = p_n$ if, in addition,

\[ n p_n \sum_{s>p_n} \varrho_s^2 \to 0 \tag{1.3} \]

holds. For smooth parametric models for the autoregression coefficients, we
even have (R) with $q_n = 1$, as shown in Section 2.

We denote the number of nonzero coefficients among $\{\varphi_s : s \ge 1\}$ by

\[ N = \sum_{s \ge 1} 1[\varphi_s \ne 0]. \]

We can then express (C) as $N \ge 1$. If $N$ is finite, then (S) holds and the
autoregression coefficients decay exponentially. Moreover, (Q) holds with
$\zeta = 1$ if $p_n = \log(n)\log(\log n)$.

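If we assume that $|\varrho_s| \le B s^{-1-\alpha}$ for some $\alpha > 0$, then
we have

\[ \sum_{s>p_n} |\varrho_s| = O(p_n^{-\alpha}) \quad\text{and}\quad n p_n \sum_{s>p_n} \varrho_s^2 = O(n p_n^{-2\alpha}). \]

The choice $p_n \sim n^{\beta}$ with $2\beta\alpha > 1$ then gives (1.3) and (Q)
with $\zeta = \beta\alpha - 1/2$.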
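As a concrete instance of this choice (the numbers are purely illustrative and not taken from the paper), suppose the autoregression coefficients are bounded by $B s^{-1-\alpha}$ with $\alpha = 2$ and take $p_n \sim n^{\beta}$ with $\beta = 1/3$, so that $2\beta\alpha = 4/3 > 1$:

```latex
\sum_{s>p_n} |\varrho_s| = O(p_n^{-2}) = O(n^{-2/3}) = O(n^{-1/2-1/6}),
\qquad
n p_n \sum_{s>p_n} \varrho_s^2 = O(n\, p_n^{-4}) = O(n^{-1/3}) \to 0 .
```

Hence (Q) holds with $\zeta = \beta\alpha - 1/2 = 1/6$, and (1.3) is satisfied as well.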

Under (C) and (F), the density $h$ is only guaranteed to be twice
continuously differentiable. Thus, the optimal rate of nonparametric
estimators like the kernel estimator $\tilde h$ is $n^{-2/5}$. Our estimator
for $h$ is $\hat h = \hat f * \hat g$. We will show that its rate is
$n^{-1/2}$. Simulations in [33] for a related estimator in a first-order
moving average process show that $\hat h$ is better than $\tilde h$, even
for small sample sizes, and uniformly over a range of bandwidths. We note
that our estimator $\hat h$ is easy to calculate. Indeed, $\hat h(x)$ can be
written as the V-statistic

\[ \hat h(x) = \frac{1}{(n-p_n)^2} \sum_{i=p_n+1}^{n} \sum_{j=p_n+1}^{n} K_{b_n}(x - \hat\varepsilon_i - \hat Y_j), \qquad x \in \mathbb{R}, \]

where $K_b(x) = K(x/b)/b$ and $K = k * k$. Here, we used the fact that
$k_b * k_b = K_b$. Thus, it is advantageous to choose a kernel $k$ for which
$k * k$ is known.
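The double-sum (V-statistic) form of the convolution estimator can be sketched as follows. This is a minimal illustration that assumes the standard normal kernel $k$, for which $K = k * k$ is the $N(0,2)$ density; the function names are hypothetical, and the inputs are any arrays of residuals $\hat\varepsilon_i$ and differences $\hat Y_j$.

```python
import numpy as np

def gaussian_K(u):
    """K = k * k for the standard normal kernel k, i.e. the N(0, 2) density."""
    return np.exp(-0.25 * u**2) / (2.0 * np.sqrt(np.pi))

def h_hat_v_statistic(grid, eps_hat, y_hat, b):
    """Evaluate the average of K_b(x - eps_hat_i - y_hat_j) over all pairs (i, j)."""
    # All pairwise sums eps_hat_i + y_hat_j, flattened to one long vector.
    sums = (eps_hat[:, None] + y_hat[None, :]).ravel()
    out = np.empty(len(grid))
    for idx, x in enumerate(grid):   # loop over x keeps memory at the size of `sums`
        out[idx] = gaussian_K((x - sums) / b).mean() / b
    return out
```

The cost is quadratic in the number of residuals per evaluation point, which is the price of avoiding a numerical convolution of the two kernel estimators.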

Smoothness of $g$ and $h$ can be linked to the number $N$. Our main result
will thus be formulated in terms of $N$. The following conditions on the kernel
and the bandwidth are kept general in order to allow for various smoothness
assumptions in terms of an integer $m \ge 2$, where $m - 1$ will play the role
of a (known) minimal size for $N$. Under (C), we know that $N \ge 1$, so we can
always take $m = 2$.

(B) The sequences 6„, p„ and q„ and the exponent C satisfy PnQnbn^x 
^ 0, nbl"^ = 0(1), n^/^Sn ^ and n^HnSn = 0(1), where Sn = 

bn"\~^l^+Pnqnbn''\-' + bn^'^^^-^l^ 
(K) The kernel $k$ has bounded, continuous and integrable derivatives up
to order two and is of type $(m, 2)$, as defined below.

A kernel k is said to be of type {m, c) if / fk{t) dt = ion: i = 1, . . . , m and 
if / dt is finite. A kernel satisfying (K) can be chosen to be of the 

form $p\phi$, where $\phi$ is the standard normal density and $p$ is an
appropriate polynomial of degree $m$.

A possible choice of bandwidth is $b_n \sim n^{-1/(2m)}$. Condition (B) is then
met if $4m\zeta > 1$ and $p_n q_n n^{-(2m-3)/(4m)} \to 0$ hold. In particular,
$p_n = q_n \sim n^{\beta}$ requires that $8m\beta < 2m - 3$.
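For orientation, here is the arithmetic for the smallest admissible case $m = 2$, reading the bandwidth choice above as $b_n \sim n^{-1/(2m)}$ and the growth restriction as $p_n q_n n^{-(2m-3)/(4m)} \to 0$ (this reading of the exponents is an assumption of the worked example):

```latex
m = 2:\qquad b_n \sim n^{-1/4},\qquad n b_n^{2m} = n \cdot n^{-1} = 1,\qquad
4m\zeta > 1 \iff \zeta > \tfrac{1}{8},\qquad
p_n q_n n^{-1/8} \to 0 .
```

With $p_n = q_n \sim n^{\beta}$ the last requirement becomes $n^{2\beta - 1/8} \to 0$, that is, $16\beta < 1$, which is the case $m = 2$ of $8m\beta < 2m - 3$.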

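Let $\mathbb{G}_n$, $\mathbb{F}_n$ and $\mathbb{H}_n$ denote the processes
defined by

\[ \mathbb{G}_n(x) = \frac{1}{n-p_n} \sum_{j=p_n+1}^{n} \bigl( g(x - \varepsilon_j) - E[g(x - \varepsilon_j)] \bigr), \]

\[ \mathbb{F}_n(x) = \frac{1}{n-p_n} \sum_{j=p_n+1}^{n} \bigl( f(x - Y_j) - E[f(x - Y_j)] \bigr), \]

\[ \mathbb{H}_n(x) = \sum_{i=1}^{p_n} (\hat\varrho_i - \varrho_i) E[X_0 k_{b_n}(x - Y_i)], \]

for $x \in \mathbb{R}$. Let $\|\cdot\|$ denote the supremum norm. We can now
state our main result.

Theorem 1. Suppose (I), (Q), (R), (S), (F), (K) and (B) hold. Let
$N \ge m - 1 \ge 1$. Then

\[ \| \hat h - h - \mathbb{F}_n - \mathbb{G}_n + f' * \mathbb{H}_n \| = o_p(n^{-1/2}). \]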

The proof is an immediate consequence of the results in Sections 3-10.
Write

\[ \hat h - h = g * (\hat f - f) + f * (\hat g - g) + (\hat f - f) * (\hat g - g). \tag{1.4} \]

Since $f$ is $L_2$-smooth and $g$ is $L_2$-smooth of order $m - 1$, as shown in
Section 3, Lemmas 9 and 10 in Section 9 imply
$\|\hat f - f\|_2 = O_p(s_n) + o(b_n)$, while Lemmas 11 and 12 in Section 10
imply $\|\hat g - g\|_2 = O_p(s_n) + o(b_n^{m-1})$. Inequality (4.3) below and
condition (B) then give

\[ \|(\hat f - f) * (\hat g - g)\| \le \|\hat f - f\|_2 \|\hat g - g\|_2 = o_p(n^{-1/2}). \tag{1.5} \]
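The decomposition of the estimation error into two linear terms and one product term is pure algebra. Writing each estimator as the target plus its error and using bilinearity of the convolution gives the following short derivation (added here as a reading aid):

```latex
\hat f * \hat g
  = \bigl(f + (\hat f - f)\bigr) * \bigl(g + (\hat g - g)\bigr)
  = f * g + (\hat f - f) * g + f * (\hat g - g) + (\hat f - f) * (\hat g - g),
```

and subtracting $h = f * g$ leaves exactly the three terms on the right-hand side of the decomposition. Only the last term is of second order, which is why a bound on the product of the two $L_2$-errors suffices for it.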

We note that strong consistency of $\hat f$ was proved by Robinson [25, 26].
For (finite-order) nonlinear autoregressive models, convergence rates of
residual-based kernel estimators were obtained by Liebscher [17] and Müller,
Schick and Wefelmeyer [23]. By the smoothness properties of $f$, $g$ and $h$
from Section 3, Theorem 4 in Section 9, applied with $a = g$, gives

\[ \| g * (\hat f - f) - \mathbb{G}_n \| = o_p(n^{-1/2}) \tag{1.6} \]

and Theorem 5 in Section 10, applied with $a = f$, gives

\[ \| f * (\hat g - g) - \mathbb{F}_n + f' * \mathbb{H}_n \| = o_p(n^{-1/2}). \tag{1.7} \]

Theorem 1 now follows from (1.4)-(1.7).

The sequences $n^{1/2}\mathbb{G}_n$ and $n^{1/2}\mathbb{F}_n$ are tight in
$C_0(\mathbb{R})$ by Section 4. Moreover, the sequence
$n^{1/2} f' * \mathbb{H}_n$ is tight for the least squares estimators if (1.3)
also holds. Indeed, according to Lemma 1 in Section 2, the above assumptions
imply that the least squares estimators satisfy

\[ \hat\Delta = M_n^{-1} \frac{1}{n-p_n} \sum_{j=p_n+1}^{n} \mathbf{X}_{j-1} \varepsilon_j + o_p(n^{-1/2}), \tag{1.8} \]

where $\hat\Delta = (\hat\varrho_1 - \varrho_1, \dots, \hat\varrho_{p_n} - \varrho_{p_n})^\top$,
$\mathbf{X}_{j-1} = (X_{j-1}, \dots, X_{j-p_n})^\top$ and
$M_n = E[\mathbf{X}_0 \mathbf{X}_0^\top]$. Thus, if (F) holds, then
$n^{1/2} f' * \mathbb{H}_n$ is tight in $C_0(\mathbb{R})$ by Theorem 2 in
Section 7, applied with $a = f'$. Hence, $n^{1/2}(\hat h - h)$ is tight in
$C_0(\mathbb{R})$ by the above Theorem 1 and $\hat h$ is $n^{1/2}$-consistent
in $C_0(\mathbb{R})$. Since the finite-dimensional marginal distributions of
$n^{1/2}(\hat h - h)$ are asymptotically normal with mean zero, the process
$n^{1/2}(\hat h - h)$ converges weakly in $C_0(\mathbb{R})$ to a centered
Gaussian process with covariance

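\[ \Gamma(s,t) = \lim_{n \to \infty} \mathrm{Cov}(Z_n(s), Z_n(t)), \qquad s, t \in \mathbb{R}, \]

where

\[ Z_n(x) = \frac{1}{\sqrt{n}} \sum_{j=1}^{n} \Bigl( g(x - \varepsilon_j) + f(x - Y_j) - 2h(x) + \varepsilon_j \mathbf{X}_{j-1}^\top M_n^{-1} E[-\mathbf{X}_0 f'(x - Y_1)] \Bigr). \]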

We pay a price for $n^{1/2}$-consistency in several respects. One is that we
need stronger assumptions on the process, namely invertibility and a
sufficiently fast decay of the autoregression coefficients, that is,
condition (Q). Another is that we must choose, besides the bandwidth $b_n$,
the cut-off index $p_n$. However, our estimator has the advantage that its
asymptotic behavior does not depend on $b_n$ and $p_n$, at least in the ranges
we allow, while the rate of the usual kernel estimator depends on the
bandwidth.

If we strengthen (F) by imposing additional (smoothness) assumptions 
on /' and use kernels of type (r, 2) for appropriately chosen r, the bias 
terms in the estimation of $f$, $g$ and $h$ can be made smaller, allowing for
larger bandwidths and hence weaker assumptions. For example, if /' has 
bounded variation and a kernel of type (2m — 1,2) is used, then we can 
show that 11/ * kb„ - f\\2 = 0(6n^^), \\g * fcft„ - 5II2 = OibfT''^^'^) and \\h * 
~ ^11 = 0(6^"^"^) . This allows us to replace the requirements nb^ = 0(1) 
and n^/^6„s„ = 0(1) in (B) by nbf[^~'^ and nbf^ = 0(1). For the choice 
bn = (nlogn)^/(^™~2)^ ii^Q requirements of this modified condition (B) are 
then implied by pnqn {log n)^^'^n~^"''~^^^^'^"''~^^ = 0(1). This allows for larger
values of $p_n$ and avoids additional assumptions on $\zeta$.

The paper is organized as follows. In Section 2 we comment more on the
assumptions. We also look at the case where we have a parametric model
for the autoregressive coefficients and give more details for classical models
such as the AR(p), MA(1) and ARMA(1,1) models. In Section 3 we review
expansions in $C_0(\mathbb{R})$ and $L_p$. In Section 4 we give a tightness
criterion for sequences of $C_0(\mathbb{R})$-valued random elements and
sufficient conditions for tightness of empirical processes based on
observations from linear processes. These are used in later sections to show
tightness of $n^{1/2}\mathbb{F}_n$, $n^{1/2}\mathbb{G}_n$ and
$n^{1/2} f' * \mathbb{H}_n$. An important inequality is established in
Section 5. The asymptotic behavior of averages of the form
$(n-p_n)^{-1} \sum_{j=p_n+1}^{n} X_{j-i} a_n(x - Y_j)$ and their means is
studied in Section 6. Such averages arise in the stochastic expansion of
$\hat g$. Tightness of $n^{1/2} f' * \mathbb{H}_n$ is established in Section 7.
Section 8 shows how well the residuals approximate the true innovations and
gives uniform stochastic expansions for residual-based averages of the form
$(n-p_n)^{-1} \sum_{j=p_n+1}^{n} a_n(x - \hat\varepsilon_j)$ and
$(n-p_n)^{-1} \sum_{j=p_n+1}^{n} a_n(x - \hat Y_j)$. The kernel estimators
$\hat f$ and $\hat g$ are of this form. In Section 9 we give convergence rates
of $\hat f$ in $L_2$ and stochastic expansions of functionals $a * \hat f$ in
$C_0(\mathbb{R})$. Analogous results are given for $\hat g$ and $a * \hat g$ in
Section 10. We have seen above how these results enter the proof of Theorem 1.

2. Examples. The following result on the behavior of the least squares 
estimators is essentially contained in [4]. 

Lemma 1. Assume that (I), (1.3) and $p_n^3/n \to 0$ hold and that $f$ has a
finite fourth moment. Then expansion (1.8) is valid.

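Proof. The least squares estimators $(\hat\varrho_1, \dots, \hat\varrho_{p_n})^\top$
can be expressed as

\[ \hat M_n^{-1} \frac{1}{n-p_n} \sum_{j=p_n+1}^{n} \mathbf{X}_{j-1} X_j \quad\text{with}\quad \hat M_n = \frac{1}{n-p_n} \sum_{j=p_n+1}^{n} \mathbf{X}_{j-1} \mathbf{X}_{j-1}^\top. \]

We can write the error term in (1.8) as
$(\hat M_n^{-1} - M_n^{-1}) A_n + \hat M_n^{-1} B_n$ with

\[ A_n = \frac{1}{n-p_n} \sum_{j=p_n+1}^{n} \mathbf{X}_{j-1} \varepsilon_j \quad\text{and}\quad B_n = \frac{1}{n-p_n} \sum_{j=p_n+1}^{n} \mathbf{X}_{j-1} \sum_{i>p_n} \varrho_i X_{j-i}. \]

By (2.13) of Berk [4], $E[|B_n|^2] = O\bigl(p_n \sum_{i>p_n} \varrho_i^2\bigr)$,
and by the relation immediately preceding his (2.17), we have
$E[|A_n|^2] = O(p_n n^{-1})$. By his Lemma 3, we have
$p_n^{1/2} \|\hat M_n^{-1} - M_n^{-1}\|_* = o_p(1)$, where
$\|M\|_* = \sup_{|x| \le 1} |Mx|$ is the operator norm of a matrix $M$. By his
(2.14), both $\|M_n\|_*$ and $\|M_n^{-1}\|_*$ are bounded. Combining the above
with (1.3), we obtain

\[ (\hat M_n^{-1} - M_n^{-1}) A_n = o_p(p_n^{-1/2}) O_p(p_n^{1/2} n^{-1/2}) = o_p(n^{-1/2}), \]

\[ \hat M_n^{-1} B_n = O_p\Bigl( \Bigl( p_n \sum_{i>p_n} \varrho_i^2 \Bigr)^{1/2} \Bigr) = o_p(n^{-1/2}). \]

The result follows. □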

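Of special interest is the case where we have a parametric model for the
autoregression coefficients, that is, there are functions $r_1, r_2, \dots$
from an open subset $\Theta$ of $\mathbb{R}^d$ into $\mathbb{R}$ such that
$\varrho_i = r_i(\vartheta)$ for all $i$ and some unknown $\vartheta$ in
$\Theta$. We can then take $\hat\varrho_i = r_i(\hat\vartheta)$ for all $i$ and
some estimator $\hat\vartheta$ of $\vartheta$. Now, let us impose the following
conditions:

(R1) the estimator $\hat\vartheta$ of $\vartheta$ is $n^{1/2}$-consistent, that
is, $\hat\vartheta - \vartheta = O_p(n^{-1/2})$;
(R2) the functions $r_1, r_2, \dots$ are differentiable at $\vartheta$ with
gradients $\dot r_1(\vartheta), \dot r_2(\vartheta), \dots$ and

\[ \sum_{i=1}^{\infty} \bigl( r_i(\vartheta + s) - r_i(\vartheta) - \dot r_i(\vartheta)^\top s \bigr)^2 = o(|s|^2) \quad\text{and}\quad \sum_{i=1}^{\infty} |\dot r_i(\vartheta)|^2 < \infty. \]

These conditions imply (R) with $q_n = 1$. If (C) and (F) are also met, one
obtains (see Theorem 3 in Section 7) that

\[ \| f' * \mathbb{H}_n - (\hat\vartheta - \vartheta)^\top \Lambda \| = o_p(n^{-1/2}) \]

with

\[ \Lambda(x) = \sum_{i=1}^{\infty} \dot r_i(\vartheta) E[X_0 f'(x - Y_i)], \qquad x \in \mathbb{R}. \]

Thus, if (I), (Q), (R1), (R2), (S), (F), (K), (B) and $N \ge m - 1$ hold, we
have the expansion

\[ \| \hat h - h - \mathbb{F}_n - \mathbb{G}_n + (\hat\vartheta - \vartheta)^\top \Lambda \| = o_p(n^{-1/2}) \tag{2.1} \]

and tightness of $n^{1/2}(\hat h - h)$. Weak convergence of
$n^{1/2}(\hat h - h)$ in $C_0(\mathbb{R})$ can now be established under mild
additional assumptions on $\hat\vartheta$.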

Let us now look at three special cases, namely AR(p), MA(1) and ARMA(1,1).
In these examples, the moving average and autoregression coefficients decay
exponentially, so (S) holds and the choice $p_n \sim \log(n)\log(\log(n))$
guarantees (Q) with $\zeta = 1$. We can then take $m = 2$ and
$b_n \sim n^{-1/4}$.
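Example 1. Let $X_t = \vartheta_1 X_{t-1} + \cdots + \vartheta_p X_{t-p} + \varepsilon_t$
be an AR(p) process with $\vartheta_p \ne 0$ and such that the polynomial
$\varrho(z) = 1 - \sum_{i=1}^{p} \vartheta_i z^i$ has no roots in the (complex)
unit disk. Set $\vartheta = (\vartheta_1, \dots, \vartheta_p)^\top$ and
$\mathbf{X}_{t-1} = (X_{t-1}, \dots, X_{t-p})^\top$. We can then write the
model as $X_t = \vartheta^\top \mathbf{X}_{t-1} + \varepsilon_t$. The
representation (1.2) holds with $\varrho_s = r_s(\vartheta) = \vartheta_s$ for
$s \le p$ and $\varrho_s = r_s(\vartheta) = 0$ for $s > p$. By our assumptions
on $\varrho(z)$, the moving average representation (1.1) holds with
$\varphi_s$ being the coefficients of
$1/\varrho(z) = 1 + \sum_{s=1}^{\infty} \varphi_s z^s$ and
$Y_t = X_t - \varepsilon_t = \vartheta^\top \mathbf{X}_{t-1}$. Since
$\vartheta = 0$ is ruled out, we have (C). Moreover, the moving average
coefficients decay exponentially, implying (S). Let $\hat\vartheta$ be an
$n^{1/2}$-consistent estimator of $\vartheta$. We estimate the innovations
$\varepsilon_j$ by the residuals
$\hat\varepsilon_j = X_j - \hat\vartheta^\top \mathbf{X}_{j-1}$. Here, (R2)
holds with $\dot r_i(\vartheta) = e_i$, the $i$th unit vector, for $i \le p$ and
with $\dot r_i(\vartheta) = 0$ for $i > p$, and we find
$\Lambda(x) = E[\mathbf{X}_0 f'(x - \vartheta^\top \mathbf{X}_0)]$.
A simple estimator for $\vartheta$ is the least squares estimator

\[ \hat\vartheta = \Bigl( \sum_{j=p+1}^{n} \mathbf{X}_{j-1} \mathbf{X}_{j-1}^\top \Bigr)^{-1} \sum_{j=p+1}^{n} \mathbf{X}_{j-1} X_j. \]

With $M = E[\mathbf{X}_0 \mathbf{X}_0^\top]$, $\hat\vartheta$ has the
stochastic expansion

\[ \hat\vartheta = \vartheta + M^{-1} \frac{1}{n-p} \sum_{j=p+1}^{n} \mathbf{X}_{j-1} \varepsilon_j + o_p(n^{-1/2}). \]

With this choice of $\hat\vartheta$, we obtain, in particular, that
$n^{1/2}(\hat h - h)$ converges weakly in $C_0(\mathbb{R})$ to a centered
Gaussian process. In this example, we can take $p_n = p$.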

Example 2. Let $X_t = \varepsilon_t + \vartheta \varepsilon_{t-1}$ be an MA(1)
process with $|\vartheta| < 1$ and $\vartheta \ne 0$. The moving average
representation (1.1) then holds with $\varphi_1 = \vartheta$ and
$\varphi_s = 0$ for $s > 1$, and (C) holds, as $\vartheta \ne 0$. The
representation (1.2) holds with
$\varrho_s = r_s(\vartheta) = -(-\vartheta)^s$. Let $\hat\vartheta$ be an
$n^{1/2}$-consistent estimator of $\vartheta$. We estimate the innovations
$\varepsilon_j$ by the residuals
$\hat\varepsilon_j = X_j + \sum_{i=1}^{p_n} (-\hat\vartheta)^i X_{j-i}$. It is
easy to check that (R2) holds with
$\dot r_s(\vartheta) = s(-\vartheta)^{s-1}$. We have
$Y_t = X_t - \varepsilon_t = \vartheta \varepsilon_{t-1}$ and, therefore,
$E[X_0 f'(x - Y_i)] = 0$ for $i > 1$. Thus, the expansion (2.1) holds with
$\Lambda(x) = E[X_0 f'(x - Y_1)] = E[\varepsilon_0 f'(x - \vartheta \varepsilon_0)]$.
In particular, if $\hat\vartheta$ is asymptotically linear, then
$n^{1/2}(\hat h - h)$ converges weakly in $C_0(\mathbb{R})$ to a centered
Gaussian process. Our estimator $\hat h$ is asymptotically equivalent to
the estimator

\[ \hat h_{SC}(x) = \int \hat f(x - \hat\vartheta y) \hat f(y) \, dy \]

considered by Saavedra and Cao [31]. This estimator can be written






with $L_\vartheta(x) = \int k(x - \vartheta y) k(y) \, dy$. The kernel
$L_{\hat\vartheta}$ can be replaced by a general (nonrandom) kernel $k$. The
U-statistic version of the resulting estimator,

n 

hsW = 51 [X-Ei- hj), 

is studied in [33], where a pointwise version of the above stochastic expansion
is proved. Schick and Wefelmeyer [35] generalize the result to MA(q) and
show that the expansion holds uniformly and in $L_1$.

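Example 3. Let $X_t = \alpha X_{t-1} + \varepsilon_t + \beta \varepsilon_{t-1}$
be an ARMA(1,1) process with $|\alpha|, |\beta| < 1$ and
$\alpha + \beta \ne 0$. The moving average representation (1.1) then holds with
$\varphi_s = (\alpha + \beta)\alpha^{s-1}$ and the autoregressive
representation (1.2) holds with
$\varrho_s = r_s(\alpha, \beta) = (\alpha + \beta)(-\beta)^{s-1}$. The
requirement that $\alpha + \beta \ne 0$ gives $\varphi_1 \ne 0$ and, therefore,
(C). We have
$Y_t = X_t - \varepsilon_t = \sum_{s=1}^{\infty} (\alpha + \beta)\alpha^{s-1} \varepsilon_{t-s}$.
Let $\hat\alpha$ and $\hat\beta$ be $n^{1/2}$-consistent estimators of $\alpha$
and $\beta$, respectively. We estimate the innovations $\varepsilon_j$ by the
residuals

\[ \hat\varepsilon_j = X_j - (\hat\alpha + \hat\beta) \sum_{i=1}^{p_n} (-\hat\beta)^{i-1} X_{j-i}. \]

Here, (R2) holds with
$\dot r_s(\alpha, \beta) = \bigl( (-\beta)^{s-1}, \; -(s-1)\alpha(-\beta)^{s-2} + s(-\beta)^{s-1} \bigr)^\top$.
Thus, the expansion (2.1) holds with $\vartheta = (\alpha, \beta)^\top$ and

\[ \Lambda(x) = \sum_{s=1}^{\infty} \bigl( (-\beta)^{s-1}, \; -(s-1)\alpha(-\beta)^{s-2} + s(-\beta)^{s-1} \bigr)^\top E[X_0 f'(x - Y_s)]. \]

In particular, if $\hat\alpha$ and $\hat\beta$ are asymptotically linear, then
$n^{1/2}(\hat h - h)$ converges weakly in $C_0(\mathbb{R})$ to a centered
Gaussian process.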
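The closed-form moving average and autoregression coefficients in the MA(1) and ARMA(1,1) examples can be checked by inverting the generating functions as formal power series. The sketch below is an illustration only; the helper names are hypothetical, and the sign convention follows the representations (1.1) and (1.2), in which the autoregression coefficients enter with a minus sign.

```python
import numpy as np

def invert_power_series(c, order):
    """Coefficients of 1 / c(z) up to z^order, assuming c[0] != 0."""
    inv = np.zeros(order + 1)
    inv[0] = 1.0 / c[0]
    for s in range(1, order + 1):
        # The coefficient of z^s in c(z) * inv(z) must vanish.
        acc = sum(c[k] * inv[s - k] for k in range(1, min(s, len(c) - 1) + 1))
        inv[s] = -acc / c[0]
    return inv

def arma11_coefficients(alpha, beta, order):
    """Return (varphi_1..varphi_order, rho_1..rho_order) for an ARMA(1,1) model."""
    # (1 + beta z) / (1 - alpha z) = 1 + sum_s varphi_s z^s
    ar_inverse = invert_power_series(np.array([1.0, -alpha]), order)
    ma_series = np.convolve(np.array([1.0, beta]), ar_inverse)[: order + 1]
    # Reciprocal series = 1 - sum_s rho_s z^s
    reciprocal = invert_power_series(ma_series, order)
    return ma_series[1:], -reciprocal[1:]
```

Setting `alpha = 0` reduces the model to the MA(1) case, so the same routine covers Example 2.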

3. Smoothness. Here, we shall address smoothness of $f$, $g$ and $h = f * g$.
For this, we assume that $N \ge r$ for some positive integer $r$. We can then
express $Y_0 = \sum_{i=1}^{r} \varphi_{\tau_i} \varepsilon_{-\tau_i} + Z$,
where $\tau_1, \dots, \tau_r$ are the indices of the first $r$ nonzero terms
among $\{\varphi_s : s \ge 1\}$ and
$Z = \sum_{s > \tau_r} \varphi_s \varepsilon_{-s}$. For $t \ne 0$, define
densities $f_t$ and $\bar f_t$ by $f_t(x) = f(x/t)/|t|$ and
$\bar f_t(x) = E[f_t(x - Z)]$. Since the innovations are independent with
density $f$, we find that the density $g$ of $Y_0$ equals
$\bar f_{\varphi_{\tau_1}}$ if $r = 1$ and equals the convolution
$f_{\varphi_{\tau_1}} * \cdots * f_{\varphi_{\tau_{r-1}}} * \bar f_{\varphi_{\tau_r}}$
if $r > 1$.

Let $\mathcal{A}$ denote the class of absolutely continuous functions with a
bounded and integrable almost everywhere derivative. Let $\mathcal{A}_p$ denote
the class of absolutely continuous functions with an almost everywhere
derivative in $L_p$, $p \in [1, \infty)$. It follows from (F) that $f$ belongs
to $\mathcal{A}$ and, hence, to $\mathcal{A}_p$ for each $p \in [1, \infty)$.
Elements of $\mathcal{A}$ are Lipschitz, while elements $a$ of $\mathcal{A}_p$
are $L_p$-Lipschitz with constant $C = \|a'\|_p$, that is,

\[ \|a(\cdot - t) - a\|_p \le C|t|, \qquad t \in \mathbb{R}. \]






Indeed, we can express

\[ a(x + t) - a(x) = t \int_0^1 a'(x + st) \, ds \]

and thus obtain from Jensen's inequality and Fubini's theorem that

\[ \int |a(x + t) - a(x)|^p \, dx \le |t|^p \int_0^1 \int |a'(x + st)|^p \, dx \, ds = |t|^p \|a'\|_p^p, \qquad t \in \mathbb{R}. \]

A more careful analysis shows that elements $a$ of $\mathcal{A}_p$ are
$L_p$-smooth,

\[ \|a(\cdot - t) - a + t a'\|_p \le |t| \, w_{p,a'}(|t|), \qquad t \in \mathbb{R}. \]

Here, $w_{p,v}$ denotes the $L_p$-modulus of continuity of a measurable
function $v$, defined by

\[ w_{p,v}(\delta) = \sup_{|t| \le \delta} \|v(\cdot - t) - v\|_p, \qquad \delta > 0. \]

If $v$ belongs to $L_p$, then $w_{p,v}$ is bounded by $2\|v\|_p$ and
$w_{p,v}(\delta) \to 0$ as $\delta \to 0$, by the translation continuity in
$L_p$, for which we refer to Theorem 9.5 in [30]. Also, recall that the modulus
of continuity of a function $v$ is defined by

\[ w_v(\delta) = \sup_{x, y \in \mathbb{R}, \, |y - x| \le \delta} |v(y) - v(x)| \le \sup_{|t| \le \delta} \|v(\cdot - t) - v\|, \qquad \delta > 0. \]

If $v$ belongs to $C_0(\mathbb{R})$, then $w_v$ is bounded by $2\|v\|$ and
$w_v(\delta) \to 0$ as $\delta \to 0$.

Assume now that $f$ belongs to $\mathcal{A}$. Then the densities $f_t$ and
$\bar f_t$ for $t \ne 0$ will also belong to $\mathcal{A}$. This immediately
gives that $g$ belongs to $\mathcal{A}$ if $r = 1$. Hence, $g$ is $L_p$-smooth
for each $1 \le p < \infty$. Now, assume that $r > 1$. Set
$g_i = f'_{\varphi_{\tau_1}} * \cdots * f'_{\varphi_{\tau_i}} * f_{\varphi_{\tau_{i+1}}} * \cdots * f_{\varphi_{\tau_{r-1}}} * \bar f_{\varphi_{\tau_r}}$
for $i = 1, \dots, r-1$ and
$g_r = f'_{\varphi_{\tau_1}} * \cdots * f'_{\varphi_{\tau_{r-1}}} * \bar f'_{\varphi_{\tau_r}}$.
These functions are integrable, bounded and uniformly continuous. The last
two properties stem from the fact that the convolution of a bounded function
$u$ with an integrable function $v$ is bounded and uniformly continuous in view
of the bounds $\|u * v\| \le \|u\| \|v\|_1$ and
$w_{u*v}(\delta) \le \|u\| w_{1,v}(\delta)$. It is now easy to check that $g_i$
is the $i$th derivative of $g$. Thus, we have the identity

\[ g(x + t) - g(x) - \sum_{i=1}^{r} \frac{t^i}{i!} g_i(x) = \frac{t^r}{r!} \int_0^1 \bigl( g_r(x + st) - g_r(x) \bigr) r (1 - s)^{r-1} \, ds. \]

Since $g_r$ belongs to $L_p$, we obtain from Jensen's inequality and Fubini's
theorem, as above, that

\[ \Bigl\| g(\cdot + t) - g - \sum_{i=1}^{r} \frac{t^i}{i!} g_i \Bigr\|_p \le \frac{|t|^r}{r!} w_{p,g_r}(|t|), \qquad t \in \mathbb{R}. \tag{3.1} \]

If (3.1) holds and $g_r \in L_p$, then we say that $g$ is $L_p$-smooth of order
$r$. This property reduces to $L_p$-smoothness if $r = 1$.
Since $h$ equals $f * g$, the above arguments show that $h$ is $(r+1)$-times
continuously differentiable with bounded, integrable and uniformly continuous
derivatives. This implies that

\[ \Bigl\| h(\cdot + t) - h - \sum_{i=1}^{r+1} \frac{t^i}{i!} h^{(i)} \Bigr\| \le \frac{|t|^{r+1}}{(r+1)!} w_{h^{(r+1)}}(|t|), \qquad t \in \mathbb{R}. \tag{3.2} \]

If (3.2) holds and $h^{(r+1)}$ is bounded and uniformly continuous, we say that
$h$ is smooth of order $r + 1$.

Let us now summarize our findings.

Corollary 1. Let $f$ belong to $\mathcal{A}$ and $N \ge r \ge 1$. Then $f$ is
$L_2$-smooth, $g$ belongs to $\mathcal{A}$ and is $L_2$-smooth of order $r$ and
$h$ is smooth of order $r + 1$.

Corollary 2. Let $a$ be $L_2$-smooth of order $r$ and let $k$ be a kernel of
type $(m, 2)$ with $m \ge r$. Then $\|a * k_{b_n} - a\|_2 = o(b_n^r)$.

Corollary 3. Let $a$ be smooth of order $r$ and let $k$ be a kernel of type
$(m, 1)$ with $m \ge r$. Then $\|a * k_{b_n} - a\| = o(b_n^r)$.

4. Weak convergence in $C_0(\mathbb{R})$. In this section, we address weak
convergence of sequences of random elements in the space $C_0(\mathbb{R})$ of
continuous functions vanishing at (plus and minus) infinity, endowed with the
supremum norm $\|\cdot\|$. To establish tightness, we use the following
characterization of compact subsets of $C_0(\mathbb{R})$.

Lemma 2. A closed subset $A$ of $C_0(\mathbb{R})$ is compact if and only if

\[ \lim_{\delta \downarrow 0} \sup_{a \in A} \sup_{|z - y| \le \delta} |a(z) - a(y)| = 0, \]

\[ \lim_{K \uparrow \infty} \sup_{a \in A} \sup_{|z| > K} |a(z)| = 0. \]

A proof of this lemma is given in [34]. From the lemma, we immediately
obtain the following characterization of tightness.

Corollary 4. A sequence $A_n$ of $C_0(\mathbb{R})$-valued random elements is
tight if and only if for every $\epsilon > 0$ and $\eta > 0$, there exist a
$\delta > 0$ and a $K < \infty$ such that

\[ \sup_n P\Bigl( \sup_{|z - y| \le \delta} |A_n(z) - A_n(y)| > \epsilon \Bigr) \le \eta, \tag{4.1} \]

\[ \sup_n P\Bigl( \sup_{|z| > K} |A_n(z)| > \epsilon \Bigr) \le \eta. \tag{4.2} \]

Once tightness is established, weak convergence follows from the convergence
of the finite-dimensional distributions.

Let $a_1$ and $a_2$ be two square-integrable functions. Then $a_1 * a_2$
belongs to $C_0(\mathbb{R})$. Indeed, an application of the Cauchy-Schwarz
inequality and a substitution yield

\[ \|a_1 * a_2\| \le \|a_1\|_2 \|a_2\|_2. \tag{4.3} \]

Hence, $a_1 * a_2$ is bounded. Furthermore,

\[ \|a_1 * a_2(\cdot - t) - a_1 * a_2\| \le \|a_1(\cdot - t) - a_1\|_2 \|a_2\|_2. \tag{4.4} \]

Since $a_1$ is square-integrable, we obtain from the translation continuity of
square-integrable functions (see, e.g., [30], Theorem 9.5) that
$\|a_1(\cdot - t) - a_1\|_2 \to 0$ as $t \to 0$. This shows that $a_1 * a_2$ is
uniformly continuous. Finally, write $\chi_K(y) = 1[|y| > K]$ and
$a_1 * a_2 = a_1 * (a_2(1 - \chi_K)) + a_1 * (a_2 \chi_K)$. Since
$|x - y| > K$ if $|x| > 2K$ and $|y| \le K$, we obtain

\[ \sup_{|x| > 2K} |a_1 * a_2(x)| \le \|a_1 \chi_K\|_2 \|a_2\|_2 + \|a_1\|_2 \|a_2 \chi_K\|_2. \tag{4.5} \]

Hence $a_1 * a_2$ vanishes at infinity. The above shows that $a_1 * a_2$ is in
$C_0(\mathbb{R})$.
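The sup-norm bound for the convolution of two square-integrable functions is easy to check numerically on a grid. The sketch below uses two illustrative functions chosen for this example (a two-sided exponential and the indicator of the unit interval, both with $L_2$-norm one) and a Riemann-sum approximation of the convolution; none of these choices comes from the paper.

```python
import numpy as np

def sup_and_l2_bound(a1, a2, dx):
    """Return (sup |a1 * a2|, ||a1||_2 ||a2||_2) for samples on a grid with spacing dx."""
    conv = np.convolve(a1, a2, mode="full") * dx      # Riemann-sum convolution
    sup_norm = np.abs(conv).max()
    l2_bound = np.sqrt((a1**2).sum() * dx) * np.sqrt((a2**2).sum() * dx)
    return sup_norm, l2_bound

dx = 0.01
grid = np.arange(-20.0, 20.0, dx)
a1 = np.exp(-np.abs(grid))                            # two-sided exponential
a2 = ((grid >= 0.0) & (grid <= 1.0)).astype(float)    # indicator of [0, 1]
sup_norm, l2_bound = sup_and_l2_bound(a1, a2, dx)
```

For the discretized sums the Cauchy-Schwarz inequality holds exactly, so `sup_norm` cannot exceed `l2_bound` whatever the grid spacing.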

If $a$ is a square-integrable function and $\mathbb{D}_n$ is a sequence of
$L_2$-valued random elements, then inequalities (4.3)-(4.5) yield

\[ \|a * \mathbb{D}_n(\cdot - t) - a * \mathbb{D}_n\| \le \|a(\cdot - t) - a\|_2 \|\mathbb{D}_n\|_2, \]

\[ \sup_{|x| > 2K} |a * \mathbb{D}_n(x)| \le \|a \chi_K\|_2 \|\mathbb{D}_n\|_2 + \|a\|_2 \|\mathbb{D}_n \chi_K\|_2. \]

This shows that the $C_0(\mathbb{R})$-valued sequence $a * \mathbb{D}_n$ is
tight if $\|\mathbb{D}_n\|_2 = O_p(1)$ and if for all positive $\epsilon$ and
$\eta$, there exists a $K$ such that
$\sup_n P(\|\mathbb{D}_n \chi_K\|_2 > \epsilon) \le \eta$. In view of the
Markov inequality, a sufficient condition for these two statements is the
following condition.

(T) There exists an integrable function $\Psi$ such that
$E[\mathbb{D}_n^2(x)] \le \Psi(x)$ for all $x \in \mathbb{R}$.

Now, let $\xi_1, \xi_2, \dots$ be a stationary sequence of random variables
with distribution function $D$ and let

\[ \mathbb{D}_n(x) = n^{-1/2} \sum_{j=1}^{n} \bigl( 1[\xi_j \le x] - D(x) \bigr), \qquad x \in \mathbb{R}, \]

be the associated empirical process. If $A$ is absolutely continuous with an
almost everywhere derivative $A'$ that is both integrable and
square-integrable, then we can express

\[ A_n(x) = n^{-1/2} \sum_{j=1}^{n} \bigl( A(x - \xi_j) - E[A(x - \xi_j)] \bigr), \qquad x \in \mathbb{R}, \]

as

\[ A_n(x) = \int A'(x - y) \mathbb{D}_n(y) \, dy = A' * \mathbb{D}_n(x), \qquad x \in \mathbb{R}. \]

Thus, the sequence $A_n$ will be tight if we can show that condition (T) holds.
In the following, we give sufficient conditions for (T).

(a) If $\xi_1, \xi_2, \dots$ are independent, then condition (T) holds if the
random variables have a finite mean. Indeed, we have the identity
$E[\mathbb{D}_n^2(x)] = D(x)(1 - D(x))$ and $D(1 - D)$ is integrable if and
only if the $\xi_j$ have finite mean.

(b) Now assume that $\xi_1, \xi_2, \dots$ come from a linear process

\[ \xi_j = \sum_{s=0}^{\infty} d_s U_{j-s}, \]

where the innovations $U_t$, $t \in \mathbb{Z}$, are i.i.d. with finite mean,
the coefficients $d_0, d_1, \dots$ are summable and $d_0 \ne 0$. Then condition
(T) holds if $\sum_{s=0}^{\infty} (1 + s)|d_s| < \infty$. This follows from
Corollary 7.1 in [36].
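In the independent case the dominating function in (T) can be taken to be $D(1 - D)$, and its integral can be computed explicitly in simple cases. As an illustration (the standard normal choice is an assumption of this example, not of the paper), for standard normal variables the integral of $D(1 - D)$ equals half the mean absolute difference of two independent copies, that is, $1/\sqrt{\pi}$. The sketch below approximates the integral with a midpoint rule; the function names are hypothetical.

```python
import math

def integral_D_one_minus_D(cdf, lower, upper, steps):
    """Midpoint-rule approximation of the integral of D(x) * (1 - D(x)) over [lower, upper]."""
    width = (upper - lower) / steps
    total = 0.0
    for i in range(steps):
        x = lower + (i + 0.5) * width
        d = cdf(x)
        total += d * (1.0 - d)
    return total * width

def normal_cdf(x):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

approx = integral_D_one_minus_D(normal_cdf, -12.0, 12.0, 48000)
```

A heavy-tailed distribution without a finite mean would make the same integral diverge as the integration range grows, which is the content of the equivalence stated in (a).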

5. A bound. Let $U_t$, $t \in \mathbb{Z}$, be independent and identically
distributed random variables with finite mean. For summable coefficients
$c_0, c_1, \dots$ and $d_0, d_1, \dots$ with $d_0 \ne 0$, let us consider the
linear processes

\[ S_t = \sum_{s=0}^{\infty} c_s U_{t-s} \quad\text{and}\quad T_t = \sum_{s=0}^{\infty} d_s U_{t-s}, \qquad t \in \mathbb{Z}. \]

For a measurable function $a$, we define

\[ K(x) = n^{-1/2} \sum_{j=1}^{n} \bigl( a(x - T_j) - E[a(x - T_j)] \bigr), \]

\[ H(x) = n^{-1/2} \sum_{j=1}^{n} \bigl( S_j a(x - T_j) - E[S_j a(x - T_j)] \bigr), \qquad x \in \mathbb{R}. \]

Let $U = U_0$ and set

\[ \alpha = \sum_{j=0}^{\infty} |c_j|, \qquad D = \sum_{j=0}^{\infty} (j + 1)|d_j| = \sum_{j=0}^{\infty} \sum_{s=j}^{\infty} |d_s|. \]

In their Lemma 7.3, Schick and Wefelmeyer [36] show the following result.

Lemma 3. Suppose $a$ is bounded and $L_1$-Lipschitz with constant $L$. Let
$D$ be finite. Then

\[ \int E[K^2(x)] \, dx \le 4L \|a\| D E[|U|]. \]
We shall now obtain a similar result for the process $H$.

Lemma 4. Suppose $a$ is bounded and $L_1$-Lipschitz with constant $L$ and
$U$ has a finite second moment. Let $D$ be finite. Then

\[ \int E[H^2(x)] \, dx \le 8L \|a\| \alpha^2 D E[|U|] E[U^2]. \]

Proof. We can write $H(x) = n^{-1/2} \sum_{j=1}^{n} (Z_j(x) - E[Z_j(x)])$,
where $Z_j(x) = S_j a(x - T_j)$, $x \in \mathbb{R}$.

Now, set

\[ S_j^* = \sum_{s=0}^{j-1} c_s U_{j-s}, \qquad \bar S_j = \sum_{s=j}^{\infty} c_s U_{j-s}, \]

\[ T_j^* = \sum_{s=0}^{j-1} d_s U_{j-s}, \qquad \bar T_j = \sum_{s=j}^{\infty} d_s U_{j-s}. \]

We can then write

\[ Z_j(x) = S_j^* a(x - T_j^* - \bar T_j) + \bar S_j a(x - T_j^* - \bar T_j) \]

and obtain, with $\mathcal{F}$ denoting the $\sigma$-field generated by
$\{U_t : t \le 0\}$, that

\[ \bar Z_j(x) = E(Z_j(x) \mid \mathcal{F}) = a_j^*(x - \bar T_j) + \bar S_j \bar a_j(x - \bar T_j), \tag{5.1} \]

where $a_j^*$ and $\bar a_j$ are the functions defined by

\[ a_j^*(x) = E[S_j^* a(x - T_j^*)] \quad\text{and}\quad \bar a_j(x) = E[a(x - T_j^*)], \qquad x \in \mathbb{R}. \]

These functions inherit the $L_1$-Lipschitz property of $a$. More precisely, we
have the bounds

\[ \|a_j^*(\cdot - t) - a_j^*\|_1 \le E[|S_j^*|] L|t| \le BL|t| \quad\text{and}\quad \|\bar a_j(\cdot - t) - \bar a_j\|_1 \le L|t|, \tag{5.2} \]

where $B = \alpha E[|U|]$. To simplify notation, we abbreviate $S_0$ by $S$,
$T_0$ by $T$ and $Z_0$ by $Z$. Using stationarity and a conditioning argument,
we obtain

\[ E[H^2(x)] = \mathrm{Var}(Z(x)) + \frac{2}{n} \sum_{j=1}^{n-1} (n - j) \mathrm{Cov}(Z(x), Z_j(x)) \le 2 \sum_{j=0}^{\infty} \Gamma_j(x), \]

where, in view of (5.1), $\Gamma_j(x)$ can be taken to be

\[ \Gamma_j(x) = E\bigl[ |Z(x) - E[Z(x)]| \, |a_j^*(x - \bar T_j) - a_j^*(x) + \bar S_j (\bar a_j(x - \bar T_j) - \bar a_j(x))| \bigr]. \]
Since $a$ is bounded, we derive the bounds $|Z(x)| \le |S| \|a\|$ and
$|E[Z(x)]| \le E[|S|] \|a\|$ for $x \in \mathbb{R}$. This,
$E[|S|] \le B = \alpha E[|U|]$ and (5.2) yield that

llr.lli < \\a\\E[{\S\+E[\S\]){BL\Tj\+LE[\SjfM 

< \\a\\BL(Y.\ds+j\E[{\S\+E[\S\])\U^s\] + 2j2 h\\ds\E[U^]\ 

\s>0 s,t>j / 

< \\a\\BL{2aE[U^] +2aE[U^])Y,\ds\- 

■i>j 

In view of B = aE[\U\] and the definition of D, the desired result is now 
immediate. □ 

6. An auxiliary result. Let $X_t$ be a linear process as in (1.1). Let $a_n$ be
an integrable function that belongs to $\mathcal{A}_1$. For $i = 1, 2, \dots$,
set

\[ \hat a_{n,i}(x) = \frac{1}{n-p_n} \sum_{j=p_n+1}^{n} X_{j-i} a_n(x - Y_j), \qquad x \in \mathbb{R}, \]

\[ \bar a_{n,i}(x) = E[\hat a_{n,i}(x)] = E[X_0 a_n(x - Y_i)], \qquad x \in \mathbb{R}. \]

In this section, we study the behavior of $\hat a_{n,i}$ and its expectation
$\bar a_{n,i}$ in $L_2$.
The results developed here will be used in later sections with a„ = k^,^ or 

From Lemma 4, we immediately obtain the following result. 

Lemma 5. Suppose (C) and (S) hold. Then there exists a finite constant
$A$ such that

\[ \int \mathrm{Var}(\hat a_{n,i}(x)) \, dx \le \frac{A}{n-p_n} \|a_n\| \|a_n'\|_1, \qquad i = 1, 2, \dots. \]

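We denote the index of the first nonzero moving average coefficient by

\[ \tau = \inf\{s \ge 1 : \varphi_s \ne 0\}. \]

Under (C), $\tau$ is finite. Let $Z_j = Y_j - \varphi_\tau \varepsilon_{j-\tau}$.
A conditioning argument shows that

\[ \bar a_{n,i}(x) = 1[i = \tau] E[v_n(x - Z_i)] + E[X_0 u_n(x - Z_i)] \]

with

\[ u_n(x) = E[a_n(x - \varphi_\tau \varepsilon_0)] \quad\text{and}\quad v_n(x) = E[\varepsilon_0 a_n(x - \varphi_\tau \varepsilon_0)], \qquad x \in \mathbb{R}. \]

Then $u_n = a_n * \psi_0$ and $v_n = a_n * \psi_1$, where

\[ \psi_0(x) = \frac{1}{|\varphi_\tau|} f\Bigl( \frac{x}{\varphi_\tau} \Bigr) \quad\text{and}\quad \psi_1(x) = \frac{x}{\varphi_\tau |\varphi_\tau|} f\Bigl( \frac{x}{\varphi_\tau} \Bigr), \qquad x \in \mathbb{R}. \tag{6.1} \]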
Under assumption (F), $\psi_0$ and $\psi_1$ belong to $\mathcal{A}$.

If $u_n$ converges in $L_2$ to some $u$ and $v_n$ converges in $L_2$ to some
$v$, then we find that $\bar a_{n,i}$ converges in $L_2$ to $\bar a_i$, where

\[ \bar a_i(x) = 1[i = \tau] E[v(x - Z_i)] + E[X_0 u(x - Z_i)], \qquad x \in \mathbb{R}. \]

Actually, a stronger statement is possible.

Lemma 6. Let (C), (S) and (F) hold. Suppose that there exist
square-integrable functions $u$ and $v$ with $u$ in $\mathcal{A}_2$ such that
$\|a_n * \psi_1 - v\|_2 \to 0$, $\|a_n * \psi_0 - u\|_2 \to 0$ and
$\|a_n * \psi_0' - u'\|_2 \to 0$. Then

\[ \sum_{i=1}^{\infty} \|\bar a_{n,i} - \bar a_i\|_2 \to 0 \quad\text{and}\quad \sum_{i=1}^{\infty} \|\bar a_i\|_2 < \infty. \]
1=1 1=1 

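Proof. For $i > \tau$ and $w \in \mathcal{A}_2$, we have

\[ E[X_0 w(x - Z_i)] = E[X_0 (w(x - Z_i) - w(x - \tilde Z_i))] \]

with $\tilde Z_i = \sum_{\tau < s < i} \varphi_s \varepsilon_{i-s}$ and, hence,

\[ \int (E[X_0 w(x - Z_i)])^2 \, dx \le E[X_0^2] \int E[(w(x - Z_i) - w(x - \tilde Z_i))^2] \, dx \le E[X_0^2] \|w'\|_2^2 E[(Z_i - \tilde Z_i)^2] = E[X_0^2] \|w'\|_2^2 E[\varepsilon_0^2] \sum_{s=i}^{\infty} \varphi_s^2. \]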

With w = an * tpQ — u and assumption (S), we obtain 

and with w = u, we obtain 

E < E[Xi]E[el]\\u'g E < oo. 

i>T S>T 

The desired results are now immediate, as an,i converges in L2 to Oj for 

i<T. a 



Remark 1. The assumptions on $a_n$ of the previous lemma hold with
$u = a * \psi_0$ and $v = a * \psi_1$ if $a_n$ converges in $L_2$ to some $a$.
They hold with $u = \psi_0$ and $v = \psi_1$ if $a_n = k_{b_n}$. In the first
case, $\bar a_i = a * \delta_i$, and in the second case,
$\bar a_i = \delta_i$, where

\[ \delta_i(x) = 1[i = \tau] E[\psi_1(x - Z_0)] + E[X_0 \psi_0(x - Z_i)]. \tag{6.2} \]
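7. Tightness of $\mathbb{M}_n$. Let us now address tightness of $n^{1/2} a * \mathbb{M}_n$
for some square-integrable $a$. For such an $a$, we have, with $a_n = a * k_{b_n}$,

$$a * \mathbb{M}_n(x) = \sum_{i=1}^{p_n} (\hat\varrho_i - \varrho_i) E[X_0 a_n(x - Y_i)] = \Delta^\top E[\mathbf{X}_0 a_n(x - Y_1)], \qquad x \in \mathbb{R}.$$

Recall that $\Delta = (\hat\varrho_1 - \varrho_1, \dots, \hat\varrho_{p_n} - \varrho_{p_n})^\top$ and $\mathbf{X}_{j-1} = (X_{j-1}, \dots, X_{j-p_n})^\top$.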

We shall first treat the case where (1.8) holds. As seen in the proof of
Lemma 1, the dispersion matrix $M_n = E[\mathbf{X}_0 \mathbf{X}_0^\top]$ is invertible and the
operator norm of its inverse is bounded. Hence, there exists a constant $K$
such that for all $n$,

$$c_n^\top M_n^{-1} c_n \le K |c_n|^2 \quad\text{and}\quad c_n^\top M_n^{-2} c_n \le K |c_n|^2, \qquad c_n \in \mathbb{R}^{p_n}. \tag{7.1}$$
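Let $\delta = (\delta_1, \dots, \delta_{p_n})^\top$ with $\delta_i$ as defined in (6.2). Now, set

$$\mathbb{J}_n(x) = \frac{1}{n - p_n} \sum_{j=p_n+1}^{n} \varepsilon_j \mathbf{X}_{j-1}^\top M_n^{-1} \delta(x), \qquad x \in \mathbb{R}.$$

We point out that for any square-integrable $a$,

$$a * \mathbb{J}_n(x) = \frac{1}{n - p_n} \sum_{j=p_n+1}^{n} \varepsilon_j \mathbf{X}_{j-1}^\top M_n^{-1} E[\mathbf{X}_0 a(x - Y_1)], \qquad x \in \mathbb{R}.$$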



Theorem 2. Let (C), (I), (F), (S) and (1.8) hold and $p_n \to \infty$. Then,
for each square-integrable $a$, the sequence $n^{1/2} a * \mathbb{J}_n$ is tight in $C_0(\mathbb{R})$ and
$\|a * (\mathbb{M}_n - \mathbb{J}_n)\| = o_p(n^{-1/2})$.

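Proof. Since $\mu_{n,i}(x) = E[X_0 k_{b_n}(x - Y_i)]$ equals $E[X_{1-i} k_{b_n}(x - Y_1)]$, we
obtain that $\mathbb{M}_n = \Delta^\top \mu_n$, where $\mu_n(x) = E[\mathbf{X}_0 k_{b_n}(x - Y_1)]$. Let us set

$$\bar\Delta = M_n^{-1} \frac{1}{n - p_n} \sum_{j=p_n+1}^{n} \mathbf{X}_{j-1} \varepsilon_j.$$

By the results in Section 6, we have, with $v_n = k_{b_n} * \psi_1$ and $u_n = k_{b_n} * \psi_0$,
that

$$\mu_{n,i}(x) = 1[i = \tau]\,E[v_n(x - Z_0)] + E[X_0 u_n(x - Z_i)].$$

Since $\|k_{b_n} * \psi_i - \psi_i\|_2 \to 0$ for $i = 0, 1$ and $\|k_{b_n} * \psi_0' - \psi_0'\|_2 \to 0$, we obtain
from Lemma 6, applied with $a_n = k_{b_n}$, that

$$\sum_{i=1}^{\infty} \|\mu_{n,i} - \delta_i\|_2^2 \to 0 \quad\text{and}\quad \sum_{i=1}^{\infty} \|\delta_i\|_2^2 < \infty.$$

From this, we obtain that $\|\mu_n\|_2 = O(1)$. This shows that

$$\|\mathbb{M}_n - \bar\Delta^\top \mu_n\|_2 = \|(\Delta - \bar\Delta)^\top \mu_n\|_2 \le |\Delta - \bar\Delta| \, \|\mu_n\|_2 = o_p(n^{-1/2}). \tag{7.2}$$

A martingale argument and straightforward calculations show that

$$(n - p_n) E[\mathbb{J}_n^2(x)] = E[\varepsilon_0^2] E[(\mathbf{X}_0^\top M_n^{-1} \delta(x))^2]
= E[\varepsilon_0^2] E[\delta(x)^\top M_n^{-1} \mathbf{X}_0 \mathbf{X}_0^\top M_n^{-1} \delta(x)]
= E[\varepsilon_0^2] \, \delta(x)^\top M_n^{-1} M_n M_n^{-1} \delta(x).$$

This shows that

$$(n - p_n) E[\mathbb{J}_n^2(x)] \le E[\varepsilon_0^2] K \sum_{i=1}^{\infty} \delta_i^2(x).$$

Since $\sum_{i=1}^{\infty} \delta_i^2$ is integrable, $n^{1/2} a * \mathbb{J}_n$ is tight by the results in Section 4.
Since $\mu_{n,i} = k_{b_n} * \delta_i$, we find that $a * (\bar\Delta^\top \mu_n) = k_{b_n} * a * \mathbb{J}_n$. Thus, by the
tightness of $n^{1/2} a * \mathbb{J}_n$, we obtain that $\|a * (\bar\Delta^\top \mu_n) - a * \mathbb{J}_n\| = o_p(n^{-1/2})$.
This and (7.2) establish $n^{1/2} \|a * (\mathbb{M}_n - \mathbb{J}_n)\| = o_p(1)$. □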



Now, let us look at the case of parametric autocorrelation coefficients as
described in Section 2. We then have $\hat\varrho_i = r_i(\hat\vartheta)$ and $\varrho_i = r_i(\vartheta)$. We assume
that (R1) and (R2) hold. This gives the expansion

Pn 



1=1 



Fix a square-integrable $a$. Under (C), (S) and (F), we have

$$\sum_{i=1}^{\infty} \|a * \mu_{n,i} - a * \delta_i\|^2 \le \|a\|_2^2 \sum_{i=1}^{\infty} \|\mu_{n,i} - \delta_i\|_2^2$$



and 



j=l 2=1 

Using the Cauchy-Schwarz inequality, we find that 

2 r\r) 



Pn 



1=1 
and 



< 



RnY\ 

1=1 



a* fJ"n,i\ 



Pn 

^ri(t?)a*^„ j -^rj(i?)a*(5j 
1=1 



i=l 

f Pn 



<^|ri('(?)n ^||a*/i„,i-a*(5i||^+ ^ ||a*(5i 

i=l \i=l i=pn+l 



0, 
provided $p_n \to \infty$. This shows that under (C), (I), (F), (R1), (R2) and (S),
we have



1=1 



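Since $a * \delta_i(x) = E[X_0 a(x - Y_i)]$, we have the following result.

Theorem 3. Suppose that (C), (I), (F), (R1), (R2) and (S) hold and
that $\hat\varrho_i = r_i(\hat\vartheta)$ and $\varrho_i = r_i(\vartheta)$. Let $p_n \to \infty$. Then
$\|a * \mathbb{M}_n - (\hat\vartheta - \vartheta)^\top A\| = o_p(n^{-1/2})$, where

$$A(x) = \sum_{i=1}^{\infty} \dot r_i(\vartheta) E[X_0 a(x - Y_i)], \qquad x \in \mathbb{R}.$$

If $\dot r_i(\vartheta) = 0$ for all $i > p$, as is the case in the AR(p) model, then the
requirement that $p_n \to \infty$ can be relaxed to $p_n = p$.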

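8. Behavior of the residuals. In this section, we study how close the
residuals are to the actual innovations. Recall that $\Delta = (\hat\varrho_1 - \varrho_1, \dots, \hat\varrho_{p_n} - \varrho_{p_n})^\top$
and $\mathbf{X}_{j-1} = (X_{j-1}, \dots, X_{j-p_n})^\top$. Note that condition (R) is equivalent
to $|\Delta|^2 = O_p(q_n n^{-1})$. Under (I), we also have

$$\bar{\mathbf{X}} = \frac{1}{n - p_n} \sum_{j=p_n+1}^{n} \mathbf{X}_{j-1} = O_p(p_n^{1/2} n^{-1/2}). \tag{8.1}$$

This follows since we have

$$(n - p_n) E\Big[\Big(\frac{1}{n - p_n} \sum_{j=p_n+1}^{n} X_{j-i}\Big)^2\Big] \le C E[X_0^2]$$

for some constant $C$ independent of $n$ and $i$. Thus, we derive

$$\Delta^\top \bar{\mathbf{X}} = O_p(p_n^{1/2} q_n^{1/2} n^{-1}). \tag{8.2}$$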

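The residuals can be expressed as

$$\hat\varepsilon_j = X_j - \sum_{i=1}^{p_n} \hat\varrho_i X_{j-i}
= X_j - \sum_{i=1}^{p_n} \hat\varrho_i X_{j-i} - \sum_{i > p_n} \varrho_i X_{j-i} + \sum_{i > p_n} \varrho_i X_{j-i}
= \varepsilon_j^* + \sum_{i > p_n} \varrho_i X_{j-i},$$

where

$$\varepsilon_j^* = \varepsilon_j - \sum_{i=1}^{p_n} (\hat\varrho_i - \varrho_i) X_{j-i} = \varepsilon_j - \Delta^\top \mathbf{X}_{j-1}. \tag{8.3}$$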
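The decomposition of the residuals can be checked numerically. The following is a minimal sketch, not the construction used in the paper: it assumes an AR(1) process with coefficient 0.5, fits $p_n = 3$ coefficients by ordinary least squares (the paper only requires an estimator satisfying (R)), and uses the hypothetical helper names `lagged_matrix` and `ar_residuals`. In a finite-order autoregression with $p_n \ge p$ the tail sum over $i > p_n$ vanishes, so the residuals coincide with $\varepsilon_j - \Delta^\top \mathbf{X}_{j-1}$.

```python
import numpy as np

def lagged_matrix(x, p):
    """Rows are the lag vectors (x_{j-1}, ..., x_{j-p}) for j = p+1, ..., n."""
    n = len(x)
    return np.column_stack([x[p - i: n - i] for i in range(1, p + 1)])

def ar_residuals(x, rho_hat):
    """Residuals x_j - sum_i rho_hat[i-1] * x_{j-i} for j = p+1, ..., n."""
    rho_hat = np.asarray(rho_hat, dtype=float)
    p = len(rho_hat)
    return x[p:] - lagged_matrix(x, p) @ rho_hat

# Simulated AR(1) data (an illustrative assumption, not from the paper).
rng = np.random.default_rng(0)
n, p = 2000, 3
eps = rng.standard_normal(n)
x = np.empty(n)
x[0] = eps[0]
for t in range(1, n):
    x[t] = 0.5 * x[t - 1] + eps[t]

rho_true = np.array([0.5, 0.0, 0.0])
lags = lagged_matrix(x, p)
# Least-squares fit of the first p coefficients (assumed estimator).
rho_hat, *_ = np.linalg.lstsq(lags, x[p:], rcond=None)

eps_hat = ar_residuals(x, rho_hat)
delta = rho_hat - rho_true
# Right-hand side of the decomposition: eps_j - Delta^T X_{j-1}.
eps_star = eps[p:] - lags @ delta
```

Here `eps_hat` and `eps_star` agree up to rounding error, and `delta` is small, of the order $n^{-1/2}$ per coordinate in this parametric setting.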
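Lemma 7. Suppose that (I), (Q) and (R) hold. Then

$$\sum_{j=p_n+1}^{n} (\hat\varepsilon_j - \varepsilon_j^*)^2 = O_p(n^{-2\zeta}), \tag{8.4}$$

$$\sum_{j=p_n+1}^{n} (\varepsilon_j^* - \varepsilon_j)^2 = O_p(p_n q_n), \tag{8.5}$$

$$\frac{1}{n - p_n} \sum_{j=p_n+1}^{n} (\hat\varepsilon_j - \varepsilon_j) = O_p(n^{-1/2-\zeta}) + O_p(p_n^{1/2} q_n^{1/2} n^{-1}). \tag{8.6}$$

If the innovations have a finite moment of order $\xi > 2$, then

$$\max_{p_n < j \le n} |\hat\varepsilon_j - \varepsilon_j| = O_p(n^{-\zeta}) + O_p(p_n^{1/2} q_n^{1/2} n^{-1/2+1/\xi}). \tag{8.7}$$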

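Proof. It follows from the Cauchy-Schwarz inequality that

$$(\varepsilon_j^* - \varepsilon_j)^2 \le \sum_{i=1}^{p_n} (\hat\varrho_i - \varrho_i)^2 \sum_{i=1}^{p_n} X_{j-i}^2. \tag{8.8}$$

From this bound, assumption (R) and the fact that $E[X_0^2] < \infty$, we obtain

$$\sum_{j=p_n+1}^{n} (\varepsilon_j^* - \varepsilon_j)^2 = O_p(q_n n^{-1}) O_p(p_n n) = O_p(p_n q_n). \tag{8.9}$$

It follows from the Minkowski inequality that the $L_2(P)$-norm of $\hat\varepsilon_j - \varepsilon_j^* =
\sum_{s > p_n} \varrho_s X_{j-s}$ is bounded by the $L_2(P)$-norm of $X_0$ times $\sum_{s > p_n} |\varrho_s|$. Thus,

$$E\Big[\sum_{j=p_n+1}^{n} (\hat\varepsilon_j - \varepsilon_j^*)^2\Big] \le n E[X_0^2] \Big(\sum_{s > p_n} |\varrho_s|\Big)^2 = O(n^{-2\zeta}),$$

which implies (8.4). It follows from (8.4) that

$$\max_{p_n < j \le n} |\hat\varepsilon_j - \varepsilon_j^*| = O_p(n^{-\zeta}), \tag{8.10}$$

$$\frac{1}{n - p_n} \sum_{j=p_n+1}^{n} (\hat\varepsilon_j - \varepsilon_j^*) = O_p(n^{-1/2-\zeta}). \tag{8.11}$$

Indeed, the square of the left-hand side of (8.10) is bounded by $R_n$, the
left-hand side of (8.4), while the squared error term of (8.11) is bounded by
$R_n/(n - p_n)$. Thus, (8.6) follows since, by (8.2), we have

$$\frac{1}{n - p_n} \sum_{j=p_n+1}^{n} (\varepsilon_j^* - \varepsilon_j) = -\Delta^\top \bar{\mathbf{X}} = O_p(p_n^{1/2} q_n^{1/2} n^{-1}). \tag{8.12}$$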
The additional moment assumption on the innovations gives $E[|X_0|^\xi] < \infty$.
From this, we obtain that $\max_{1 \le j \le n} |X_j| = O_p(n^{1/\xi})$. Indeed, for each
$\eta > 0$,

$$P\Big(\max_{1 \le j \le n} |X_j| > \eta n^{1/\xi}\Big) \le \sum_{j=1}^{n} P(|X_j| > \eta n^{1/\xi}) \le \eta^{-\xi} E[|X_0|^\xi 1[|X_0| > \eta n^{1/\xi}]].$$

It follows from this, inequality (8.8) and assumption (R) that

$$\max_{p_n < j \le n} |\varepsilon_j^* - \varepsilon_j|^2 \le p_n \sum_{i=1}^{p_n} (\hat\varrho_i - \varrho_i)^2 \max_{1 \le j \le n} X_j^2 = O_p(p_n q_n n^{-1+2/\xi}). \tag{8.13}$$

Combining (8.10) and (8.13), we obtain (8.7). □

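Lemma 8. Suppose that (I), (Q) and (R) hold. Let $a_n$ be a sequence
of functions with bounded integrable derivatives up to order two such that
$\|a_n'\| = O(1)$ and $\|a_n''\| = o(p_n^{-1} q_n^{-1} n^{1/2})$. Then

$$\sup_{x \in \mathbb{R}} \Big| \frac{1}{n - p_n} \sum_{j=p_n+1}^{n} \big(a_n(x - \hat Y_j) - a_n(x - Y_j) + \Delta^\top \mathbf{X}_{j-1} a_n'(x - Y_j)\big) \Big| = o_p(n^{-1/2}). \tag{8.14}$$

If, further, $p_n q_n / n \to 0$ and $\|a_n''\|_2 = o(p_n^{-1/2} q_n^{-1/2} n^{1/2})$, then

$$\sup_{x \in \mathbb{R}} \Big| \frac{1}{n - p_n} \sum_{j=p_n+1}^{n} \big(a_n(x - \hat\varepsilon_j) - a_n(x - \varepsilon_j)\big) \Big| = o_p(n^{-1/2}). \tag{8.15}$$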

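Proof. Note that (8.4) implies

$$Q_n = \frac{1}{n - p_n} \sum_{j=p_n+1}^{n} |\hat\varepsilon_j - \varepsilon_j^*| = O_p(n^{-1/2-\zeta}), \tag{8.16}$$

while (8.3) and (8.5) imply

$$T_n = \frac{1}{n - p_n} \sum_{j=p_n+1}^{n} (\Delta^\top \mathbf{X}_{j-1})^2 = O_p(p_n q_n n^{-1}). \tag{8.17}$$

The expression following the supremum in (8.14) can be written as $|r_n(x)|$,
where

$$r_n(x) = \frac{1}{n - p_n} \sum_{j=p_n+1}^{n} \big(a_n(x - \hat Y_j) - a_n(x - Y_j) + \Delta^\top \mathbf{X}_{j-1} a_n'(x - Y_j)\big).$$

Define $r_n^*$ as $r_n$ but with $\hat Y_j = X_j - \hat\varepsilon_j$ replaced by $X_j - \varepsilon_j^*$. Then

$$\|r_n - r_n^*\| \le \|a_n'\| Q_n = O_p(n^{-\zeta-1/2} \|a_n'\|).$$

A Taylor expansion yields the bound

$$\|r_n^*\| \le \|a_n''\| T_n = O_p(p_n q_n n^{-1} \|a_n''\|).$$

This establishes (8.14). The same arguments yield

$$\sup_{x \in \mathbb{R}} \Big| \frac{1}{n - p_n} \sum_{j=p_n+1}^{n} \big(a_n(x - \hat\varepsilon_j) - a_n(x - \varepsilon_j) - \Delta^\top \mathbf{X}_{j-1} a_n'(x - \varepsilon_j)\big) \Big| = o_p(n^{-1/2}).$$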



In view of (8.2), we have 

— 1/2 

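Result (8.15) now follows if we can show that $\|\alpha_n\| = o_p(q_n^{-1/2})$ for

$$\alpha_n(x) = \frac{1}{n - p_n} \sum_{j=p_n+1}^{n} \mathbf{X}_{j-1} \big(a_n'(x - \varepsilon_j) - E[a_n'(x - \varepsilon_j)]\big).$$

It follows from Fubini's theorem that $\alpha_n = a_n'' * W_n$ with

$$W_n(x) = \frac{1}{n - p_n} \sum_{j=p_n+1}^{n} \mathbf{X}_{j-1} (1[\varepsilon_j \le x] - F(x)).$$

Thus, $\|\alpha_n\| \le \|a_n''\|_2 \|W_n\|_2$. Since

$$(n - p_n) E[\|W_n\|_2^2] = E[|\mathbf{X}_0|^2] \int F(x)(1 - F(x)) \, dx = O(p_n),$$

we obtain $\|\alpha_n\| = O_p(p_n^{1/2} n^{-1/2} \|a_n''\|_2) = o_p(q_n^{-1/2})$. □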

9. Estimating the innovation density $f$. The kernel estimator based on
the residuals is

$$\hat f(x) = \frac{1}{n - p_n} \sum_{j=p_n+1}^{n} k_{b_n}(x - \hat\varepsilon_j), \qquad x \in \mathbb{R}.$$

In this section, we study convergence of $\hat f$ in the space $L_2$ and of functionals
of the form $a * \hat f$ in the space $C_0(\mathbb{R})$.
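As an illustration of the residual-based kernel estimator, here is a minimal sketch under stated assumptions: the data follow a simulated AR(1) process, the coefficients are fitted by ordinary least squares with a fixed order `p`, and $k$ is the standard normal density with $k_b(x) = k(x/b)/b$. None of these choices are taken from the paper, which only imposes conditions such as (R) and (K); the function names are hypothetical.

```python
import numpy as np

def lagged_matrix(x, p):
    """Rows are the lag vectors (x_{j-1}, ..., x_{j-p}) for j = p+1, ..., n."""
    n = len(x)
    return np.column_stack([x[p - i: n - i] for i in range(1, p + 1)])

def residual_kernel_estimate(x, p, b, grid):
    """Kernel estimate of the innovation density from least-squares residuals."""
    lags = lagged_matrix(x, p)
    # Assumed estimator of the autoregression coefficients.
    rho_hat, *_ = np.linalg.lstsq(lags, x[p:], rcond=None)
    eps_hat = x[p:] - lags @ rho_hat
    # Gaussian kernel (an assumption): average of k_b(grid - eps_hat_j).
    u = (grid[:, None] - eps_hat[None, :]) / b
    weights = np.exp(-0.5 * u**2) / (b * np.sqrt(2.0 * np.pi))
    return weights.mean(axis=1)

# Simulated AR(1) series with standard normal innovations.
rng = np.random.default_rng(1)
n = 1000
eps = rng.standard_normal(n)
x = np.empty(n)
x[0] = eps[0]
for t in range(1, n):
    x[t] = 0.5 * x[t - 1] + eps[t]

grid = np.linspace(-8.0, 8.0, 801)
f_hat = residual_kernel_estimate(x, p=3, b=0.3, grid=grid)
dx = grid[1] - grid[0]
total_mass = f_hat.sum() * dx  # Riemann sum of the estimate over the grid
```

The estimate is nonnegative and integrates to approximately one over the grid. The bandwidth 0.3 is arbitrary here; the rate conditions on $b_n$ in the results of this section are not tuned in the sketch.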

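Let $\tilde f$ denote the kernel estimator based on the actual innovations $\varepsilon_{p_n+1}, \dots, \varepsilon_n$,

$$\tilde f(x) = \frac{1}{n - p_n} \sum_{j=p_n+1}^{n} k_{b_n}(x - \varepsilon_j), \qquad x \in \mathbb{R}.$$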

The first result is known. 
Lemma 9. Suppose that the kernel $k$ is square-integrable and of type
$(m, 2)$. Let $f$ be $L_2$-smooth of order $r < m$. Then

$$\|\tilde f - f\|_2 = O_p(b_n^{-1/2} n^{-1/2}) + o(b_n^r).$$

Proof. It is well known that $E[\tilde f(x)] = f * k_{b_n}(x)$ and

$$(n - p_n) E[\|\tilde f - f * k_{b_n}\|_2^2] \le \|k_{b_n}^2 * f\|_1 \le b_n^{-1} \|k\|_2^2.$$

Thus, $\|\tilde f - f * k_{b_n}\|_2 = O_p(b_n^{-1/2} n^{-1/2})$. By Corollary 2, $\|f * k_{b_n} - f\|_2 = o(b_n^r)$. □

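Lemma 10. Suppose that (I), (Q), (R), (F) and (K) hold. Then

$$\|\hat f - \tilde f\|_2 = O_p(p_n q_n b_n^{-5/2} n^{-1}) + O_p(n^{-\zeta-1/2} b_n^{-3/2}).$$

Proof. Let $\varepsilon_j^*$ be as in (8.3). Let $f^*$ denote the kernel estimator based
on $\varepsilon_{p_n+1}^*, \dots, \varepsilon_n^*$. With $Q_n$ as in (8.16), we find that

$$\|\hat f - f^*\|_2^2 \le \|\hat f - f^*\| \, \|\hat f - f^*\|_1 \le \|k_{b_n}'\| \, \|k_{b_n}'\|_1 Q_n^2$$

and obtain, in view of (8.16), the rate

$$\|\hat f - f^*\|_2 = O_p(b_n^{-3/2} n^{-\zeta-1/2}).$$

The identity $\varepsilon_j^* = \varepsilon_j - \Delta^\top \mathbf{X}_{j-1}$ and a Taylor expansion yield $f^* - \tilde f =
\Delta^\top \hat\gamma_n + r_n$ with

$$\hat\gamma_n(x) = \frac{1}{n - p_n} \sum_{j=p_n+1}^{n} \mathbf{X}_{j-1} k_{b_n}'(x - \varepsilon_j),$$

$$r_n(x) = \frac{1}{n - p_n} \sum_{j=p_n+1}^{n} \int_0^1 \int_0^1 (\Delta^\top \mathbf{X}_{j-1})^2 \, t \, k_{b_n}''(x - \varepsilon_j + st \Delta^\top \mathbf{X}_{j-1}) \, ds \, dt.$$

With $T_n$ as in (8.17), we obtain $\|r_n\| \le \|k_{b_n}''\| T_n = O_p(p_n q_n b_n^{-3} n^{-1})$ and
$\|r_n\|_1 \le \|k_{b_n}''\|_1 T_n = O_p(p_n q_n b_n^{-2} n^{-1})$, and, consequently,

$$\|r_n\|_2^2 \le \|r_n\| \, \|r_n\|_1 = O_p(p_n^2 q_n^2 b_n^{-5} n^{-2}).$$

Let $\bar\gamma_n = \bar{\mathbf{X}} \, k_{b_n}' * f$. Since $\|k_{b_n}' * f\|_2 = \|f' * k_{b_n}\|_2 \le \|f'\|_2 \|k\|_1$, we obtain
from (8.2) that

$$\|\Delta^\top \bar\gamma_n\|_2 \le |\Delta^\top \bar{\mathbf{X}}| \, \|k_{b_n}' * f\|_2 = O_p(p_n^{1/2} q_n^{1/2} n^{-1}).$$

A martingale argument yields

$$(n - p_n) E[\|\hat\gamma_n - \bar\gamma_n\|_2^2] \le p_n E[X_0^2] \, \|(k_{b_n}')^2 * f\|_1 = O(p_n b_n^{-3}).$$

Thus, $\|\Delta^\top (\hat\gamma_n - \bar\gamma_n)\|_2 = O_p(p_n^{1/2} q_n^{1/2} b_n^{-3/2} n^{-1})$. The above imply the desired
rate. □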
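Theorem 4. Suppose that (I), (Q), (R), (F) and (K) hold. Let $a \in A$
and let $a * f$ be smooth of order $r < m$. Let the bandwidth satisfy $n b_n^{2r} = O(1)$
and $p_n q_n b_n^{-1} n^{-1/2} \to 0$. Then

$$\|a * (\hat f - f) - \mathbb{A}_n\| = o_p(n^{-1/2}),$$

where

$$\mathbb{A}_n(x) = \frac{1}{n - p_n} \sum_{j=p_n+1}^{n} \big(a(x - \varepsilon_j) - E[a(x - \varepsilon_j)]\big), \qquad x \in \mathbb{R}.$$

Proof. Let $\bar f = E[\tilde f] = f * k_{b_n}$. Since $a * f$ is smooth of order $r < m$ and
$k$ is of type $(m, 1)$, Corollary 3 yields

$$\|a * \bar f - a * f\| = \|(a * f) * k_{b_n} - a * f\| = o(b_n^r) = o(n^{-1/2}).$$

We can write $a * (\tilde f - \bar f) = \mathbb{A}_n * k_{b_n}$. Since $n^{1/2} \mathbb{A}_n$ is tight in $C_0(\mathbb{R})$ by result
(a) in Section 4, we obtain that $\|n^{1/2} (\mathbb{A}_n * k_{b_n} - \mathbb{A}_n)\| = o_p(1)$. In other
words,

$$\|a * (\tilde f - \bar f) - \mathbb{A}_n\| = o_p(n^{-1/2}).$$

We can now calculate that

$$a * (\hat f - \tilde f)(x) = \frac{1}{n - p_n} \sum_{j=p_n+1}^{n} \big(a_n(x - \hat\varepsilon_j) - a_n(x - \varepsilon_j)\big), \qquad x \in \mathbb{R},$$

with $a_n = a * k_{b_n}$. The function $a_n$ is then twice differentiable with $a_n' = a' * k_{b_n}$ and
$a_n'' = a' * k_{b_n}'$. We have $\|a_n'\| \le \|a'\| \, \|k_{b_n}\|_1 = O(1)$, $\|a_n''\| \le \|a'\| \, \|k_{b_n}'\|_1 = O(b_n^{-1})$ and
$\|a_n''\|_2^2 \le \|a_n''\| \, \|a_n''\|_1 \le \|a_n''\| \, \|a'\|_1 \|k_{b_n}'\|_1 = O(b_n^{-2})$. In view of $p_n q_n b_n^{-1} n^{-1/2} \to 0$,
Lemma 8 yields

$$\|a * (\hat f - \tilde f)\| = o_p(n^{-1/2}).$$

The desired result follows from the above. □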

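10. Estimating the density $g$. The kernel estimator based on the
estimated versions $\hat Y_j = X_j - \hat\varepsilon_j$ of the $Y_j = X_j - \varepsilon_j$ is

$$\hat g(x) = \frac{1}{n - p_n} \sum_{j=p_n+1}^{n} k_{b_n}(x - \hat Y_j), \qquad x \in \mathbb{R}.$$

In this section, we study convergence of $\hat g$ in the space $L_2$ and of functionals
of the form $a * \hat g$ in the space $C_0(\mathbb{R})$. Let $\tilde g$ denote the kernel estimator based
on $Y_{p_n+1}, \dots, Y_n$,

$$\tilde g(x) = \frac{1}{n - p_n} \sum_{j=p_n+1}^{n} k_{b_n}(x - Y_j), \qquad x \in \mathbb{R}.$$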
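The two residual-based kernel estimators can be computed side by side. The sketch below is illustrative only and rests on several assumptions not taken from the paper: a simulated AR(1) series, ordinary least squares for the coefficients, a Gaussian kernel, and a common bandwidth for both estimators. It also forms a numerical convolution of the two estimates on a grid, on the assumption that the plug-in estimator of the stationary density takes the form $\hat f * \hat g$, as the abstract's description of a convolution of two residual-based kernel estimators suggests. All function names are hypothetical.

```python
import numpy as np

def lagged_matrix(x, p):
    """Rows are the lag vectors (x_{j-1}, ..., x_{j-p}) for j = p+1, ..., n."""
    n = len(x)
    return np.column_stack([x[p - i: n - i] for i in range(1, p + 1)])

def kernel_estimate(points, grid, b):
    """Average of k_b(grid - point) with a Gaussian kernel k (an assumption)."""
    u = (grid[:, None] - points[None, :]) / b
    return (np.exp(-0.5 * u**2) / (b * np.sqrt(2.0 * np.pi))).mean(axis=1)

def plug_in_estimates(x, p, b, grid):
    """Return (f_hat, g_hat, h_hat) on a grid that is symmetric about zero."""
    lags = lagged_matrix(x, p)
    rho_hat, *_ = np.linalg.lstsq(lags, x[p:], rcond=None)  # assumed estimator
    y_hat = lags @ rho_hat       # Y_hat_j = X_j - eps_hat_j
    eps_hat = x[p:] - y_hat      # residuals
    f_hat = kernel_estimate(eps_hat, grid, b)
    g_hat = kernel_estimate(y_hat, grid, b)
    dx = grid[1] - grid[0]
    # Discrete convolution; with an odd number of symmetric grid points,
    # mode="same" returns values aligned with the original grid.
    h_hat = np.convolve(f_hat, g_hat, mode="same") * dx
    return f_hat, g_hat, h_hat

rng = np.random.default_rng(2)
n = 1000
eps = rng.standard_normal(n)
x = np.empty(n)
x[0] = eps[0]
for t in range(1, n):
    x[t] = 0.5 * x[t - 1] + eps[t]

grid = np.linspace(-12.0, 12.0, 1201)
dx = grid[1] - grid[0]
f_hat, g_hat, h_hat = plug_in_estimates(x, p=3, b=0.3, grid=grid)
```

Each of the three arrays is nonnegative and has Riemann sum close to one. The alignment of the convolution with the grid relies on the grid being symmetric about zero with an odd number of points.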

We first give an analogue of Lemma 9. 
Lemma 11. Suppose that (C) and (S) hold. Let the kernel $k$ be square-
integrable and of type $(m, 2)$. Let $f$ belong to $A_1 \cap A_2$ and have finite mean.
Let $g$ be $L_2$-smooth of order $r$ with $r < m$. Then

$$\|\tilde g - g\|_2 = O_p(b_n^{-1/2} n^{-1/2}) + o(b_n^r).$$

Proof. By Corollary 2, we have $\|g * k_{b_n} - g\|_2 = o(b_n^r)$. It remains to
show that

$$\|\tilde g - g * k_{b_n}\|_2 = O_p(b_n^{-1/2} n^{-1/2}). \tag{10.1}$$

Recall the notation $\tau = \inf\{s \ge 1 : \varphi_s \ne 0\}$. We can write $Y_j = \varphi_\tau \varepsilon_{j-\tau} + Z_j$
with $Z_j = \sum_{s > \tau} \varphi_s \varepsilon_{j-s}$. Let $a_n = k_{b_n} * \psi_0$, where $\psi_0$ is the density of $\varphi_\tau \varepsilon_0$.
We can then express $\tilde g - g * k_{b_n}$ as the sum $T_1 + k_{b_n} * T_2$ with

$$T_1(x) = \frac{1}{n - p_n} \sum_{j=p_n+1}^{n} \big(k_{b_n}(x - Y_j) - a_n(x - Z_j)\big),$$

$$T_2(x) = \frac{1}{n - p_n} \sum_{j=p_n+1}^{n} \big(\psi_0(x - Z_j) - E[\psi_0(x - Z_j)]\big).$$

Using a martingale argument, we obtain $(n - p_n) E[\|T_1\|_2^2] \le \|k_{b_n}^2 * g\|_1 =
O(b_n^{-1})$ and thus $\|T_1\|_2 = O_p(b_n^{-1/2} n^{-1/2})$. Since $f$ belongs to $A_1 \cap A_2$, so
does $\psi_0$. Thus, $n^{1/2} T_2$ is tight by result (b) in Section 4, applied with A = $\psi_0$
and ^j = Zj. This shows that \\T2*h„\\l < ||r2||^||A:fe„||i < HTzU ||r2||i||/c|| = 
Op(n-^/2)_ This finishes the proof of (10.1). □ 

Let us define functions $\mu_n$ and $\mu_n'$ by

$$\mu_n(x) = E[\mathbf{X}_0 k_{b_n}(x - Y_1)] \quad\text{and}\quad \mu_n'(x) = E[\mathbf{X}_0 k_{b_n}'(x - Y_1)].$$

We now give analogues of Lemma 10 and Theorem 4.

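Lemma 12. Suppose that (C), (I), (Q), (R), (S), (F) and (K) hold. Then

$$\|\hat g - \tilde g + \Delta^\top \mu_n'\|_2 = O_p(p_n q_n b_n^{-5/2} n^{-1}) + O_p(n^{-\zeta-1/2} b_n^{-3/2}).$$

Proof. Let $g^*$ denote the kernel estimator based on $Y_{p_n+1}^*, \dots, Y_n^*$ with
$Y_j^* = X_j - \varepsilon_j^*$. As in the proof of Lemma 10, we find that

$$\|\hat g - g^*\|_2 = O_p(n^{-\zeta-1/2} b_n^{-3/2}) \quad\text{and}\quad
\|g^* - \tilde g + \Delta^\top \hat\mu_n'\|_2 = O_p(p_n q_n b_n^{-5/2} n^{-1}),$$

where

$$\hat\mu_n'(x) = \frac{1}{n - p_n} \sum_{j=p_n+1}^{n} \mathbf{X}_{j-1} k_{b_n}'(x - Y_j).$$

Note that $\|k_{b_n}'\| = O(b_n^{-2})$ and $\|k_{b_n}'\|_1 = O(b_n^{-1})$. Thus, it follows from Lemma 5,
applied with $a_n = k_{b_n}'$, that

$$\int E[\|\hat\mu_n'(x) - E[\hat\mu_n'(x)]\|^2] \, dx = O(p_n b_n^{-3} n^{-1}).$$

Since $\mu_n'(x) = E[\hat\mu_n'(x)]$, we see that

$$\|\Delta^\top (\hat\mu_n' - \mu_n')\|_2 = O_p(p_n^{1/2} q_n^{1/2} b_n^{-3/2} n^{-1}).$$

The above rates yield the desired result. □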

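Theorem 5. Suppose that (C), (I), (Q), (R), (S), (F) and (K) hold.
Let $a \in A$ and let $a * g$ be smooth of order $r$ with $r < m$. Let the bandwidth
satisfy $n b_n^{2r} = O(1)$ and $p_n q_n b_n^{-1} n^{-1/2} \to 0$. Then

$$\|a * (\hat g - g) - \mathbb{K}_n + a' * (\Delta^\top \mu_n)\| = o_p(n^{-1/2}),$$

where

$$\mathbb{K}_n(x) = \frac{1}{n - p_n} \sum_{j=p_n+1}^{n} \big(a(x - Y_j) - E[a(x - Y_j)]\big), \qquad x \in \mathbb{R}.$$

Proof. Set $\bar g = E[\tilde g] = g * k_{b_n}$. Since $a * g$ is smooth of order $r$ and the
kernel $k$ is of type $(m, 1)$ with $m > r$, we obtain from Corollary 3 that

$$\|a * \bar g - a * g\| = \|(a * g) * k_{b_n} - a * g\| = o(b_n^r) = o(n^{-1/2}).$$

Simple calculations yield $a * (\tilde g - \bar g) = \mathbb{K}_n * k_{b_n}$. Since $a$ belongs to $A_1 \cap A_2$
and $f$ has finite mean, it follows from (S) and result (b) in Section 4 that
$n^{1/2} \mathbb{K}_n$ is tight in $C_0(\mathbb{R})$. Consequently, $\|n^{1/2} (\mathbb{K}_n * k_{b_n} - \mathbb{K}_n)\| = o_p(1)$. In
other words,

$$\|a * (\tilde g - \bar g) - \mathbb{K}_n\| = o_p(n^{-1/2}).$$

With $a_n = a * k_{b_n}$, one verifies that

$$a * (\hat g - \tilde g)(x) = \frac{1}{n - p_n} \sum_{j=p_n+1}^{n} \big(a_n(x - \hat Y_j) - a_n(x - Y_j)\big), \qquad x \in \mathbb{R}.$$

Now, let

$$\hat\mu_n(x) = \frac{1}{n - p_n} \sum_{j=p_n+1}^{n} \mathbf{X}_{j-1} k_{b_n}(x - Y_j), \qquad x \in \mathbb{R}.$$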




Since $\|a_n'\| = O(1)$, $\|a_n''\| = O(b_n^{-1})$ and $\|a_n''\|_2 = O(b_n^{-1})$, as shown in the
proof of Theorem 4, and since $p_n q_n b_n^{-1} n^{-1/2} \to 0$, we obtain from Lemma 8
and $a_n' = a' * k_{b_n}$ that

$$\|a * (\hat g - \tilde g) + a' * (\Delta^\top \hat\mu_n)\| = o_p(n^{-1/2}).$$

It follows from Lemma 5, $\|k_{b_n}\| = O(b_n^{-1})$ and $\|k_{b_n}\|_1 = O(1)$ that

$$\int E[\|\hat\mu_n(x) - E[\hat\mu_n(x)]\|^2] \, dx = O(p_n b_n^{-1} n^{-1}).$$

Since $\mu_n(x) = E[\hat\mu_n(x)]$, we find that

$$\|a' * (\Delta^\top (\hat\mu_n - \mu_n))\| \le \|a'\|_2 \, |\Delta| \, \|\hat\mu_n - \mu_n\|_2 = O_p(p_n^{1/2} q_n^{1/2} b_n^{-1/2} n^{-1}).$$

The desired result follows from the above. □

Acknowledgments. We thank an Associate Editor and two referees for 
suggestions that led us to completely rewrite this paper. We had originally 
introduced a more complicated $n^{1/2}$-consistent density estimator that in-